Searches open-access museums, archives, and libraries for what you want. Builds an Excel spreadsheet of citations and info as you go. Free. Private. No sign-up.
Your picked images are saved here until you download them.
This is a small program that reads your spreadsheet and downloads every image in it to organized folders on your computer.
Step 1: Download your spreadsheet above.
Step 2: Download this tool.
Step 3: Put them in the same folder and double-click the tool.
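For the curious, here is a rough sketch of what a downloader like this might do under the hood. It is not the tool's actual code. The column names (`source`, `title`, `image_url`) and the `images/<source>/<title>` folder layout are assumptions made up for illustration, and the sketch starts from rows that have already been read out of the spreadsheet (reading an .xlsx file itself would need a third-party library).

```python
import re
import urllib.request
from pathlib import Path
from urllib.parse import urlparse


def safe_name(text: str, fallback: str = "untitled") -> str:
    """Turn arbitrary text into a filesystem-friendly name."""
    cleaned = re.sub(r"[^A-Za-z0-9]+", "_", text).strip("_")
    return cleaned[:80] or fallback


def plan_downloads(rows, root="images"):
    """Map each row to (url, target path) as root/<source>/<title>.<ext>.

    Column names here are hypothetical, not the tool's real schema.
    """
    plan = []
    for row in rows:
        url = (row.get("image_url") or "").strip()
        if not url:
            continue  # rows without an image link are skipped
        ext = Path(urlparse(url).path).suffix.lower() or ".jpg"
        folder = Path(root) / safe_name(row.get("source") or "", "unknown_source")
        plan.append((url, folder / (safe_name(row.get("title") or "") + ext)))
    return plan


def download_all(plan):
    """Fetch every planned file, creating folders as needed."""
    for url, target in plan:
        target.parent.mkdir(parents=True, exist_ok=True)
        with urllib.request.urlopen(url, timeout=30) as resp:
            target.write_bytes(resp.read())
```

Planning the paths separately from fetching makes it easy to preview where files would land before anything is downloaded. Note that in this sketch two rows with the same source and title would map to the same file.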
Why not to use ChatGPT, Midjourney, Stable Diffusion, DALL-E, Adobe Firefly – just some opinions.
"Scraped" means taken without asking. Scraping relies on crawler bots: a company writes programs that crawl the internet and download every image they find. Those images are stored on private company servers owned by wealthy shareholders. Those shareholders are perhaps 1% of the population, and a culturally homogeneous 1% at that.
Images are fed into a training process that connects their "styles" to linguistic terms like "oil painting" or "Van Gogh" or whatever. When you type a prompt, the model generates new pixels from your words. The output looks like art because it was blended from art, but it is arguably not art. The people who made the images were not asked and were not paid.
The extraction is not small. The major image generators appear to have trained on billions of images. Their output now competes directly with the people it was built from. An illustrator can spend years developing a recognizable style, only to watch a machine trained on that style sell a cheap imitation to anyone with twenty dollars and a prompt box. Fair use law was written to protect critics, teachers, scholars, and satirists. Using it to defend industrial-scale extraction for private profit is a different thing. And unlike music sampling, there is no real audit trail here. No artist can trace their work into the output, demand credit, or seek compensation. The line back to the source was cut on purpose.
As of 2026, around 87 federal lawsuits are pending against the makers of generators such as ChatGPT, Midjourney, Stable Diffusion, DALL-E, and Adobe Firefly. Every image you generate with these tools carries an unresolved legal provenance question. Beware.
No US court has yet definitively ruled on whether training on copyrighted images without a license constitutes infringement. For anyone in education, publishing, nonprofits, or institutional work, it is perhaps wiser for now to use Creative Commons images.
The image generation market sits around $10.5 billion/year. Midjourney has 107 employees and makes ~$500 million a year in revenue. DALL-E is part of OpenAI, valued at nearly $157 billion. The major corporate AI image generators bring in enormous sums, and they were trained on billions of images scraped from the internet.
Apple is worth $3.7 trillion. Microsoft is $3.1 trillion. Nvidia hit $3 trillion. OpenAI alone, a company founded only in late 2015, was valued at $157 billion after one funding round in 2024. The entire AI industry is projected to add $15.7 trillion to global GDP by 2030.
The entire US arts and culture sector (every museum, library, archive, symphony, theater, and arts nonprofit) contributes $1.2 trillion to GDP annually. The $15.7 trillion projected for AI is about thirteen times the current annual output of US arts and culture combined.
A handful of men now control more of the world's infrastructure than any government. One of them (Musk) owns roughly three quarters of all active satellites, the machinery behind GPS, weather forecasting, and global communications. Six CEOs run companies worth more than the GDP of every nation on Earth except two.
The images that trained these systems came from EVERYBODY ELSE: taxpayers, archives, working artists, and the accumulated visual record of the human species.
The data-scraping theft was sudden and global. The ownership of the output is private and concentrated inside a demographic so narrow that you are almost certainly not in it.
Yes, prompting DALL-E or Midjourney burns energy. So does driving your Toyota or streaming Netflix or refrigerating your strawberries in January.
People often talk as if typing a prompt is a singular ecological sin, but they ignore much larger, routine forms of damage. A single beef cheeseburger carries a much heavier material footprint than one image prompt because cows require land, feed, water, methane-producing digestion, slaughter, refrigeration, packaging, and trucking. The same goes for air travel, fast fashion, and endless consumer junk.
So be honest: AI image generation has an environmental cost, but it is usually being folded into a much bigger industrial mess, and moral panic about prompts can become a convenient way to avoid talking about capitalism, meat, energy, logistics, and scale. If you want to be an environmentalist, throw away your phone right now.
Image generation runs through data centers, and data centers need power, water, land, cooling systems, and transmission infrastructure. When one town fights a facility over water use, noise, tax giveaways, or grid strain, the problem does not vanish. The company just looks for a place with weaker resistance, cheaper land, or poorer people who have less power to say no. So the environmental issue is not just "AI uses electricity." The issue is where the burden gets dumped, who absorbs it, and how quickly wealthy firms can move extraction to poorer people than you. Worry about those people instead of Your Own Backyard for once.
Your phone or whatever you are reading this on is a daily, normalized object built on extraction, labor exploitation, surveillance, and energy use at massive scale. Most people will not throw it away because the device is now structurally tied to work, banking, maps, social life, medicine, and survival. The honest position is that we are all living inside systems of damage and dependency, and the real question is scale, necessity, and where the burden lands.
Humans are chameleons and thieves and always have been. Culture is made by borrowing, stealing, remixing, inheriting, misremembering, jamming, copying, and carrying things across borders, languages, and styles. There is no pure text with a single author – no "Homer."
We live by copying others – we are mimetic beings with arguably no stable sense of "self" in the first place. So who owns the art? Are there authors at all?
Museums know the issues better than anyone. Many "open access" archives are full of objects that passed through black markets, were taken by force under empire, looted in war, or stripped from ritual life. It's all warped into stable oddities in glass boxes for us to gaze at – our possession as "public domain." Many institutions are genuinely careful about provenance. Many are not.
Artists survive this economy by declaring authorship – to market a self and make a living. They think about "art" in terms of an industrial system for pricing and owning property, land, objects. But this industrial system is very weird and not normal for human history.
Most human societies through time and space have never treated culture as private commodities in the first place because they did not operate as markets. Not every culture divides the world into "artists," "owners," and "buyers" the way capitalism does. The modern art market is not some pure and innocent place.
But worse – to say that certain objects belong to a certain "nation" is nationalism. It divides the planet into invisible lines of us and them. But who gets to decide what constitutes a nation, anyway? The people in charge – the few up top, the council, government, chiefs, kings, usually men – they get to define art for everyone?
Then how is all art not simply propaganda? To push a piece of art into a particular "culture" reeks of exactly the imperial thinking that most Creative Commons work seeks to undo.
This tool is not pretending to touch some pure, untouched archive – that does not exist anywhere. The point is narrower and more honest. It works with images that institutions have already released under open-access terms. That's not perfect, but it's better than the newer corporate move: scrape everything at industrial scale, sever it from all human context, and sell their own brains back to the hordes of dumb consumers. It is a problem and I offer no solutions.
Museum records are written by professional curators, so this is not Google. Pretend you are a scholar requesting something from a museum. Say what it's made of: bronze sculpture, woodblock print, silk textile. Museum records always describe the material. Use culture or period words. Artist names work great. Use words, not sentences.
Your keys are stored only in your browser. We never see them.
An API key is a free library card. You register, you get a code, you paste it once. That's it.
| Source | How to Get the Key |
|---|---|
| Smithsonian | api.data.gov/signup — fill out name and email, key arrives instantly |
| Europeana | pro.europeana.eu — click "Request API key" |
| Rijksmuseum | rijksmuseum.nl/rijksstudio — free account, key in settings |
| Harvard Art Museums | harvardartmuseums.org — click "Request a Key" |
| DPLA | dp.la developers — click "Request a Key" |
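
To show what "paste it once" amounts to technically, here is a small sketch of a key being attached to a search request. The endpoint and the parameter names (`q`, `rows`, `api_key`) reflect my understanding of the Smithsonian Open Access API and should be checked against its current documentation. The function name is made up for illustration, and this is not the site's actual code.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Assumed endpoint for the Smithsonian Open Access search API; verify before relying on it.
SMITHSONIAN_SEARCH = "https://api.si.edu/openaccess/api/v1.0/search"


def build_keyed_search_url(query: str, api_key: str, rows: int = 10) -> str:
    """Build a search URL that carries the user's key as a query parameter."""
    if not api_key:
        raise ValueError("this source needs an API key")
    params = {"q": query, "rows": rows, "api_key": api_key}
    return f"{SMITHSONIAN_SEARCH}?{urlencode(params)}"


if __name__ == "__main__":
    # "YOUR_KEY" is a placeholder, not a working key.
    print(build_keyed_search_url("bronze sculpture", "YOUR_KEY", rows=5))
```

The key is just one more parameter on the request, which is why a source that needs one cannot be searched until you have pasted yours.
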
| Museum / Library | Key | What They Have |
|---|---|---|
| Metropolitan Museum of Art | None | 400,000+ works: painting, sculpture, armor, textiles, global |
| Art Institute of Chicago | None | 50,000+ CC0: Impressionism, Asian art, photography |
| Cleveland Museum of Art | None | 30,000+ CC0: medieval, Asian, American, African |
| SMK — National Gallery of Denmark | None | European painting, Danish art, works on paper |
| Wellcome Collection | None | Medical and scientific imagery, historical health |
| Princeton Art Museum | None | Greek/Roman ceramics, pre-Columbian |
| Wikimedia Commons | None | 100M+ files — everything |
| National Archives (NARA) | None | US government records, military, maps, immigration |
| Library of Congress | None | FSA photos (Lange, Evans), maps, prints, manuscripts |
| NASA | None | Space photography, astronomy, missions |
| Smithsonian Institution | Free key | 21 museums: American history, natural history, air & space, African American |
| Europeana | Free key | 50M+ items from European museums and libraries |
| Rijksmuseum | Free key | Dutch masters — Vermeer, Rembrandt, 800,000+ objects |
| Harvard Art Museums | Free key | 250,000 objects: ancient, Asian, European, American |
| DPLA | Free key | Aggregates thousands of US libraries and archives |
"Creative Commons" images can have very sketchy origins. Museums have stolen many things from many people. Open access solves a legal problem – not a consent problem. Many museums and archives hold material that was taken under colonial conditions, cataloged under Western property law, and later released as "public domain" or CC0 by institutions that were never the rightful moral authorities over it in the first place, if anybody ever was.
That means a file can be legally open while still carrying living cultural tensions: sacred restrictions, seasonal restrictions, gendered restrictions, lineage restrictions, or community-use limits. A good example would be images of sacred religious figures intended only for internal use by a particular culture's wise men.
This tool only harvests from sources that have already chosen to publish material under open-access terms. It does not work around community protocol systems or treat legal openness as proof of ethical permission. Platforms that use community consent protocols are deliberately left out: those communities have chosen not to default to open access, and that choice matters.
I don't call generated images "art" – we can debate that all day. Unless they involved extremely complex decisions or coding, I call them "decoration." I do use them when they assist accessibility or provocation toward positive social goals.
For example, I used DALL-E to create many complex images of animals, with prompts extending beyond 30 pages each, in order to make George Orwell's Animal Farm more accessible to kids. I then offer that as a product to educators.
To me, that is one acceptable use that becomes a kind of artistry and outweighs the costs of "scraping." However, I also create an Open version of everything, simply using stuff the community makes with pen and paper or Creative Commons images.
I am not anti-AI – I built this with AI for the social good. Yes, you can use AI for good things without loving its tech-bro billionaire corporate culture.
Unscraped Art is a search interface. It queries public APIs, returns public metadata, and displays links to publicly hosted images. It does not host, store, proxy, or redistribute any image or non-public data. Users download images directly from source institutions using their own connection.
It sends your query to the public APIs of up to 15 museums, archives, and libraries – the Met, Art Institute of Chicago, Library of Congress, Rijksmuseum, Europeana, and others – and displays what they return. We do not own or control any of the content you find here.
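The fan-out described above can be sketched roughly like this. It is an illustration under assumptions, not the site's real code: the `sources` structure, the `search` callables, and the function name are all hypothetical stand-ins for whatever each institution's API client looks like.

```python
from concurrent.futures import ThreadPoolExecutor


def fan_out(query, sources, keys):
    """Send one query to every usable source and merge whatever comes back.

    sources: list of dicts with 'name', 'needs_key', and a 'search' callable
             taking (query, key) and returning a list of result dicts.
    keys:    mapping of source name -> API key supplied by the user.
    """
    # Sources that need a key are skipped unless the user has provided one.
    usable = [s for s in sources if not s["needs_key"] or keys.get(s["name"])]
    if not usable:
        return []

    def run(source):
        try:
            hits = source["search"](query, keys.get(source["name"]))
            # Tag every hit with the institution it came from.
            return [dict(hit, source=source["name"]) for hit in hits]
        except Exception:
            return []  # one failing institution should not sink the rest

    with ThreadPoolExecutor(max_workers=len(usable)) as pool:
        batches = list(pool.map(run, usable))
    return [hit for batch in batches for hit in batch]
```

Running the searches in parallel means a slow institution delays the results less than it would in a one-by-one loop, and catching errors per source keeps one outage from blanking the whole page.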
Every image carries the rights information provided by the source institution. A label of "CC0" or "Public Domain" reflects that institution's declaration, not ours. We do not verify or guarantee rights status. Before using any image commercially, in print, or in published work, check the original source record and consult the institution's terms directly. The presence of an image in Unscraped Art does not mean it is cleared for every use. Images labeled CC BY or CC BY-SA require attribution to the originating institution – your Excel sheet will show the required credit line; it is your responsibility to use it.
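As a sketch of the attribution rule above, here is one way a rights label could be checked and a credit line assembled. The field names (`creator`, `title`, `institution`, `rights`) and the comma-separated format are assumptions for illustration. The credit line in your actual spreadsheet may be shaped differently, and the institution's own wording always takes precedence.

```python
import re

# Labels that require attribution, per the paragraph above.
ATTRIBUTION_REQUIRED = {"CC BY", "CC BY-SA"}


def needs_attribution(rights: str) -> bool:
    """True for CC BY / CC BY-SA, ignoring case and a trailing version like '4.0'."""
    label = re.sub(r"\s+\d+(\.\d+)*$", "", rights.strip().upper())
    return label in ATTRIBUTION_REQUIRED


def credit_line(record: dict) -> str:
    """Join whichever of the (hypothetical) fields are present into one line."""
    fields = ("creator", "title", "institution", "rights")
    return ", ".join(record[f] for f in fields if record.get(f))


if __name__ == "__main__":
    example = {
        "creator": "Example Artist",
        "title": "Example Title",
        "institution": "Example Museum",
        "rights": "CC BY 4.0",
    }
    if needs_attribution(example["rights"]):
        print(credit_line(example))
```

A check like this only reads the label the institution supplied. It does not verify that the label is correct, so the original source record still needs to be consulted.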
We do not collect names, emails, or accounts. Your API keys are stored only in your browser (localStorage) and are never transmitted to our servers. Your search terms pass through our server to route queries to institution APIs. We do not store them, though our hosting provider may retain standard request logs (IP, timestamp) per its default policy.
The Unscraped Art codebase is open source under the Apache 2.0 License. The website text is licensed CC BY 4.0. Images and metadata belong to their source institutions and are governed by their respective licenses.
This tool does not harvest from community-governed repositories. Legal openness is not the same as ethical consent. If your institution wants to be removed or change how we interact with your API, contact us and we will respond within 30 days.