
Treatise · AI and Creative Arts

When machines began to paint, compose, write poems, and edit videos, the line "creativity is humanity's last fortress" first showed its fragility. From 1973, when Cohen first set a computer to draw, to 2022, when diffusion models swept the world, to the explosion of video and music generation in 2024 — the relationship between art and AI has moved from tool, to collaborator, toward a word every creator now eyes warily: replacement.

Act I: The Prehistory of Machine Painting (1965–2014)

AI entered art far earlier than most people imagine.

In 1965, the German scholars Frieder Nake and Georg Nees exhibited the first geometric, computer-generated abstract works in Stuttgart. The same year, New York's Howard Wise Gallery held Computer-Generated Pictures. Early algorithmic art was led by mathematicians and engineers — they wrote rules, plotters drew on paper. The works were cold, geometric, mechanical.

In 1973, the British artist Harold Cohen launched the project that would shape the second half of his life: AARON. Written first in C and later rewritten in Lisp, the program could generate lines, compositions, and color fills on its own. Unlike algorithms that merely executed orders, AARON carried "knowledge of painting" that Cohen had encoded: what an object is, what occlusion is, what compositional balance is. AARON's works were exhibited at the Tate in London, SFMOMA, and the Centre Pompidou, among the first AI works to enter major museum collections. Cohen worked with AARON until his death in 2016, never calling it a rival, but "a tireless collaborator."

In the 1990s, Karl Sims's Genetic Images (1993) let viewers pick the images they found "more beautiful," then bred the next generation from the chosen ones by genetic algorithm. From 2001, Casey Reas and Ben Fry's Processing brought algorithmic art into the open-source mainstream. But through this stage "AI art" remained the private speech of programmers, only loosely connected to the art market.
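The select-and-breed loop behind this kind of interactive evolution can be sketched in a few lines. This is a minimal illustration, not Sims's implementation: the genome here is a plain list of floats standing in for image parameters, and the names `mutate` and `next_generation`, the mutation rate, and the vote counts are all assumptions made for the example.

```python
import random

def mutate(genome, rate=0.2, rng=random):
    """Return a copy of genome with small Gaussian noise added to each gene."""
    return [g + rng.gauss(0.0, rate) for g in genome]

def next_generation(population, votes, rng=random):
    """Breed a new population from the genome that received the most votes.

    population: list of genomes (lists of floats, a stand-in for image parameters)
    votes: list of vote counts, one per genome
    """
    winner = population[votes.index(max(votes))]
    # Keep the winner unchanged and fill the remaining slots with mutated offspring.
    return [list(winner)] + [mutate(winner, rng=rng) for _ in population[1:]]

rng = random.Random(0)                      # fixed seed so the run is repeatable
population = [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(6)]
votes = [0, 3, 1, 0, 7, 2]                  # hypothetical audience votes
population = next_generation(population, votes, rng=rng)
```

Repeating the last two lines with fresh votes each round is the whole loop: the audience supplies the fitness function, and the program supplies only variation.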

The fuse was laid; it waited for a spark.

Act II: GAN and the First Auction (2014–2018)

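In 2014, Ian Goodfellow, then a PhD student in Montreal, conceived the Generative Adversarial Network (GAN), by his own account after an argument with friends in a bar. A generator and a discriminator are pitted against each other: the generator tries to produce samples the discriminator cannot tell from real data, and at the game's equilibrium the generator's output matches the true data distribution. The setup was simple, and it opened a new chapter in image generation.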
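The game between generator and discriminator is usually written as a minimax objective. The form below follows Goodfellow et al. (2014); the symbols $p_{\text{data}}$ (the data distribution) and $p_z$ (the noise prior fed to the generator) come from that paper rather than from this essay.

```latex
\min_G \max_D V(D, G) =
  \mathbb{E}_{x \sim p_{\text{data}}(x)}\bigl[\log D(x)\bigr]
  + \mathbb{E}_{z \sim p_z(z)}\bigl[\log\bigl(1 - D(G(z))\bigr)\bigr]
```

Here $D(x)$ is the discriminator's estimated probability that $x$ is real, and $G(z)$ is the sample the generator produces from noise $z$. The discriminator pushes $V$ up, the generator pushes it down.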

In 2015, a year after GAN, Leon Gatys et al. published A Neural Algorithm of Artistic Style, and "photographs repainted in Van Gogh's style" went viral. Neural style transfer became famous almost overnight. From 2018, BigGAN and StyleGAN took resolution and diversity to new heights. Faces generated by StyleGAN were often hard to tell from real photographs, spawning phenomenon sites like thispersondoesnotexist.com.

What truly pushed "AI art" into the mainstream was an auction on October 25, 2018. The French collective Obvious had used GAN to generate a portrait in classical style — Edmond de Belamy. Christie's listed it in New York at $7,000 to $10,000. The hammer fell at $432,500, more than forty times the estimate. The signature in the lower right was a mathematical formula — the GAN loss function.

For the first time, a major auction house had sold an algorithmically generated image. Controversy followed quickly: the code drew heavily on Robbie Barrat's open-source GitHub project, and Obvious's own contribution was widely questioned. The auction was both a triumph for AI art and a warning that "authorship" would become the central question of every later debate.

Act III: The Summer of Diffusion Models (2022)

In January 2021, OpenAI unveiled DALL·E. It generated images from text prompts such as "a baby daikon radish in a tutu walking a dog." The frivolous demo gave the world its first concrete sense of what text-to-image could feel like.

But the real explosion came in 2022.

  • April: OpenAI released DALL·E 2; the leap in quality and composition made it a social-network phenomenon.
  • July: Midjourney entered open beta in a Discord channel. Its cinematic, oil-painting aesthetic drew illustrators and concept designers; the company turned a profit within six months of beta.
  • August: Stability AI released Stable Diffusion 1.4, based on the Latent Diffusion Model proposed by Robin Rombach et al. at the CompVis lab — both weights and code fully open-sourced.

The open-sourcing of Stable Diffusion was the most seismic event of this revolution. With a weight file of about 4 GB, anyone could generate images on a home GPU. Within months, communities and projects such as Hugging Face, Civitai, and the AUTOMATIC1111 web UI had built an entire stack around it: fine-tuning (DreamBooth, LoRA), control (ControlNet, open-sourced by Lvmin Zhang in February 2023), and composition (ComfyUI). "AI painting" turned from a paid cloud service into an open-source craft anyone could modify.

In late August 2022, the digital art category at the Colorado State Fair awarded first prize to Jason Allen's Théâtre D'opéra Spatial, which was generated with Midjourney and finished in Photoshop and Gigapixel. The rules did not forbid AI, and the controversy swept the art world. It was among the first widely reported cases of an AI-generated work winning a traditional art competition.

Act IV: From Stills to Motion (2023–2024)

After the image fell, the next mountain was video.

In February 2023, Runway released Gen-1, letting users restyle existing videos with text and reference images. In June it released Gen-2, formally entering "text-to-video." But for most of that year AI videos remained little more than "moving images": low resolution, weak character consistency, broken physics.

The turn came in February 2024. OpenAI released Sora and showed minute-long sample clips with camera direction and continuous action — a woman in red walking through neon-drenched Tokyo, cherry blossoms drifting in the wind. Video generation at last let people believe long takes, complex physics, and multi-character interaction were not unreachable. Sora was not opened to users that day, but it set the 2024 baseline.

Then came the chase:

  • June 2024: Kuaishou released Kling, China's first publicly available large-scale video generator; its physical consistency and cinematic language briefly drew comparisons to Sora.
  • May–December 2024: Google released Veo (May) and Veo 2 (December); Runway shipped Gen-3 Alpha (June); Meta announced the Movie Gen series (October).
  • Second half of 2024 onward: Kling 1.6, Sora 2, Tencent's Hunyuan, ByteDance's PixelDance, Zhipu's CogVideoX. Almost every month a new model appeared, and each devalued its predecessor within weeks.

The industrialization of video generation rapidly remade advertising, film previs, and social content. Coca-Cola's Christmas 2024 ad was produced with generative video tools, which shook the advertising trade. The share of AI-generated short videos on TikTok and Instagram rose visibly.

Act V: The Sound Revolution (2023–2024)

After text, image, and video, sound became the next battleground.

In December 2022, Riffusion fired an opening shot: it generated music by running a diffusion model on spectrogram images and resynthesizing audio from the result. By 2023–2024, three companies stood at the front:

  • Suno (Cambridge, Massachusetts, founded 2022): released v2 in December 2023, letting users generate full songs with lyrics from a single sentence; v3 in March 2024 extended song length from 30 seconds to 2 minutes; raised a $125 million Series B in May 2024.
  • Udio (New York, founded by former Google DeepMind researchers): open beta in April 2024, with seed funding led by a16z; seen as Suno's strongest rival.
  • ElevenLabs (London, 2022): dominant in voice cloning and audiobooks.
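Riffusion's starting point, mentioned above, is that audio can be turned into a picture. The sketch below shows only that first step, computing a magnitude spectrogram with a naive discrete Fourier transform; it is not Riffusion's pipeline, and the function name, frame size, hop, sample rate, and test tone are all assumptions chosen to keep the example small.

```python
import cmath
import math

def magnitude_spectrogram(samples, frame_size=64, hop=32):
    """Return a list of frames, each a list of DFT magnitudes for bins 0..frame_size//2."""
    frames = []
    for start in range(0, len(samples) - frame_size + 1, hop):
        frame = samples[start:start + frame_size]
        row = []
        for k in range(frame_size // 2 + 1):
            # Naive DFT of one frame at frequency bin k.
            acc = sum(x * cmath.exp(-2j * math.pi * k * n / frame_size)
                      for n, x in enumerate(frame))
            row.append(abs(acc))
        frames.append(row)
    return frames

sample_rate = 8000                       # hypothetical sample rate in Hz
tone_hz = 1000                           # hypothetical test tone
signal = [math.sin(2 * math.pi * tone_hz * n / sample_rate) for n in range(256)]
spec = magnitude_spectrogram(signal)

# Each bin spans sample_rate / frame_size = 125 Hz, so the tone lands in one bin.
loudest_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
```

Each row of `spec` is one column of the image a model would be trained on. The magnitudes discard phase, which is why turning a generated spectrogram back into sound needs a separate resynthesis step.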

The music industry's counterattack was swift. On June 24, 2024, the RIAA announced suits by Sony Music, Universal, and Warner against Suno and Udio for mass copying of copyrighted recordings for training. Both companies admitted in their replies that their training sets "include recordings obtained from the public internet," asserting fair use. The litigation continues, and its outcome will draw the line for the entire generative-music industry.

Meanwhile, in 2025 Spotify acknowledged that millions of "AI-generated" tracks were on its platform, with some artists colluding to mass-produce AI songs and skim streaming royalties. Major rights agencies began demanding training-data transparency and artist opt-out.

Act VI: Writing Assistants and the Boundaries of Literature

After ChatGPT, writing was the first creative field pulled deep into the AI tide, and the first to face backlash.

In early 2023, Amazon Kindle Direct Publishing was flooded with AI-generated children's books and short stories, and later that year the platform began requiring authors to disclose AI-generated content. Several literary magazines, Clarkesworld among them, suspended unsolicited submissions because of the surge.

But inside professional writing, AI quietly became a collaborator. Tools such as Sudowrite, NovelCrafter, and Plottr were openly used by some web novelists. In 2024, Akutagawa Prize laureate Rie Kudan stated that around 5% of her award-winning Sympathy Tower Tokyo had been generated by ChatGPT. The disclosure ignited fierce debate: where is the line of collaboration? Should the notion of authorship be revised? Do readers have a right to know?

A deeper concern is stylistic homogenization. When countless writers polish prose with the same underlying model, will internet content slowly converge to one "AI standard register"? This is creative writing's distinct danger, different from images and video — chronic, subtle linguistic impoverishment.

Act VII: The Artists Strike Back (2023–2025)

The other side of the technological celebration was unprecedented anger inside the artist community.

In 2022, Polish illustrator Greg Rutkowski found that his name had become one of the most popular Stable Diffusion prompts: more images had been generated in his style, and tagged with his name, than he had painted in his entire life. His paintings had been absorbed into LAION-5B without his consent. He called publicly for legal protection.

In January 2023, three artists (Sarah Andersen, Kelly McKernan, and Karla Ortiz) filed a class action against Stability AI, Midjourney, and DeviantArt. That same month Getty Images began proceedings against Stability AI in the UK, followed in February by a U.S. suit alleging the use of 12 million Getty images (some with identifiable watermarks) to train Stable Diffusion. Getty v. Stability went to trial in the High Court of Justice in London in 2025, among the first image-generation copyright cases to reach trial.

On the technical side, Professor Ben Zhao's team at the University of Chicago released two weapons:

  • Glaze (March 2023): imperceptible perturbations applied at the pixel level before publication, causing models to learn the wrong style features.
  • Nightshade (October 2023): a more aggressive "poisoning" tool — once images processed by it enter a training set, they shift the model systematically on the targeted concept.
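What "imperceptible" means in practice is that each pixel may move only a little. The sketch below illustrates that constraint alone. It is not Glaze's algorithm: the published tool finds its perturbation by optimization, whereas the `delta` values here are made up, and the function name and the per-pixel budget of 4 are assumptions for illustration.

```python
def apply_bounded_perturbation(pixels, delta, budget=4):
    """Add delta to pixels, limiting each change to +/- budget and keeping values in 0..255."""
    out = []
    for p, d in zip(pixels, delta):
        d = max(-budget, min(budget, d))      # clip the change itself to the budget
        out.append(max(0, min(255, p + d)))   # keep the result a valid 8-bit value
    return out

pixels = [0, 12, 128, 250, 255]               # hypothetical grayscale values
delta = [-9, 3, 20, 7, 1]                     # hypothetical raw perturbation
cloaked = apply_bounded_perturbation(pixels, delta)

# No pixel moves by more than the budget, so the image looks unchanged to a viewer.
largest_change = max(abs(c - p) for c, p in zip(cloaked, pixels))
```

The protective effect comes entirely from how the perturbation is chosen, which this sketch leaves out; the budget only guarantees that the change stays small.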

Hundreds of thousands of artists downloaded Glaze and Nightshade. For the first time in history, a community struck by technology fought back with technology of its own. From late 2022, ArtStation and other platforms introduced "NoAI" tags, letting authors declare their works off-limits to training.

On December 27, 2023, The New York Times sued OpenAI and Microsoft, alleging unauthorized use of millions of NYT articles to train the GPT series. The complaint included dozens of pages of "verbatim recitation" — under specific prompts, GPT-4 reproduced paywalled NYT articles almost word for word. It is one of the most consequential generative-AI copyright suits to date.

From 2023 to 2025, more cases gathered into a flood:

  • The Authors Guild and many writers sued OpenAI and Meta.
  • The Wall Street Journal and The New York Post sued Perplexity.
  • German and French publishers sought royalties under the EU Digital Single Market Directive.
  • China's "first AI-generated image case" reached judgment in November 2023; the Beijing Internet Court held that, under specific conditions, an image generated by Stable Diffusion could enjoy copyright — though the dispute is far from settled.

In legislation, the EU AI Act came into force in August 2024, requiring providers of general-purpose AI models to publish summaries of their training data for the first time. In the U.S., Tennessee's ELVIS Act (signed in March 2024) and the proposed federal NO FAKES Act targeted AI cloning of voices and likenesses.

Technology, law, artists, platforms, giants — the five-way contest is rewriting the relationship between "AI and creativity." It is no longer a purely aesthetic question; it is a fundamental issue of labor, property, and cultural memory. Machines have not replaced artists, but they have permanently altered the artists' situation.


Historian's Note

The encounter of art and AI is older than most realize: more than half a century has passed since Cohen wrote AARON. For the first forty years AI was a strange brush in the artist's study, and programmable painting was the private speech of a few. In the last decade diffusion models and large models burst on the scene; machine painting, machine composition, and machine novel-writing strode from the lab into the street. In the auction of Edmond de Belamy, the Tokyo street view of Sora, and the five-minute pop song of Suno, what stunned the world was never the technical ceiling, but the speed of democratization. Beneath the celebration ran undercurrents: illustrators saw a lifetime's style summoned by three English words; voice actors heard their own voices lent to others' lines; publishers found their archives inside training sets. Anti-AI tools like Glaze and Nightshade emerged in answer: for the first time in history, those struck by technology answered the flood with technology of their own. Getty v. Stability, NYT v. OpenAI, RIAA v. Suno: three lawsuits on two continents will shape the order of creation for a decade. Whether a machine can "create" is a philosophical question to be left for posterity. The pressing matter is more practical: under what conditions does machine "borrowing" not become plunder, and under what contracts can creators collaborate with machines without becoming components? AI will not kill creativity, but it has shaken the meaning of "author" more violently than any moment since the Renaissance.

Eyewitness Accounts

Call for contributions

If you are an artist, musician, writer, or designer affected by AI, please contribute on GitHub and share your experience.

References

  1. McCorduck, P. (1991). AARON's Code: Meta-Art, Artificial Intelligence, and the Work of Harold Cohen. W. H. Freeman.
  2. Goodfellow, I., Pouget-Abadie, J., Mirza, M., et al. (2014). Generative adversarial nets. Advances in Neural Information Processing Systems, 27.
  3. Gatys, L. A., Ecker, A. S., & Bethge, M. (2016). Image style transfer using convolutional neural networks. Proceedings of CVPR 2016, 2414-2423.
  4. Christie's (2018). Is artificial intelligence set to become art's next medium? (Edmond de Belamy auction record).
  5. Ramesh, A., Pavlov, M., Goh, G., et al. (2021). Zero-shot text-to-image generation (DALL·E). Proceedings of ICML 2021.
  6. Rombach, R., Blattmann, A., Lorenz, D., et al. (2022). High-resolution image synthesis with latent diffusion models. Proceedings of CVPR 2022, 10684-10695.
  7. Roose, K. (2022, September 2). An A.I.-generated picture won an art prize. Artists aren't happy. The New York Times.
  8. Shan, S., Cryan, J., Wenger, E., et al. (2023). Glaze: Protecting artists from style mimicry by text-to-image models. Proceedings of USENIX Security 2023.
  9. Shan, S., Ding, W., Passananti, J., et al. (2023). Prompt-specific poisoning attacks on text-to-image generative models (Nightshade). arXiv preprint arXiv:2310.13828.
  10. The New York Times Company v. Microsoft Corporation, OpenAI, et al. (2023, December 27). U.S. District Court, Southern District of New York, Case No. 1:23-cv-11195.
  11. Andersen et al. v. Stability AI Ltd. et al. (2023, January 13). U.S. District Court, Northern District of California.
  12. Getty Images v. Stability AI (2023, filed). High Court of Justice (UK) and U.S. District Court, District of Delaware.
  13. RIAA (2024, June 24). Major music companies sue Suno and Udio for copyright infringement.
  14. OpenAI (2024, February 15). Sora: Creating video from text. OpenAI Research Blog.

Released under CC-BY-SA 4.0