The story so far: The Internet Archive, a non-profit organization that aims to digitize, preserve, lend, and share multimedia content, is facing major legal challenges from traditional publishers who accuse it of copyright infringement. The free digital library is currently fighting the forced removal of about half a million books from its platform, which it says functions like a library.
What is the case against the Internet Archive?
While many of the books digitized and uploaded by the Internet Archive are already in the public domain – such as historical sources, old classics, etc. – many traditional publishers claim that the Internet Archive violates copyright by scanning physical copies of protected books and distributing the digital files to the public.
In the Hachette vs. Internet Archive case, which began in 2020, traditional publishers Hachette, HarperCollins, Wiley, and Penguin Random House sued the Internet Archive. On March 24, 2023, District Judge John G. Koeltl issued an order in favor of the publishers.
“The IA website includes millions of public domain ebooks that can be downloaded for free and read without restriction,” the order said, adding, “Relevant to this action, however, the website also includes 3.6 million books that are legally protected by copyright, including 33,000 Publisher titles and all Works in Suit.”
In particular, the publishers objected to the IA’s temporary ‘National Emergency Library’ (NEL) initiative, launched during the COVID-19 pandemic to let more users access the e-books in its collection while physical libraries were closed.
“During the NEL, IA lifted the technical controls enforcing its one-to-one owned-to-loaned ratio and allowed up to ten thousand patrons at a time to borrow each ebook on the Website,” said the 2023 order.
Ordinarily, IA uses a system known as “controlled digital lending” to limit the number of people who can borrow an e-book at any one time. It ended the emergency library after being sued.
The Internet Archive invoked the fair use doctrine in its defense, but the court rejected that argument. The organization said it would appeal, and did so after some delays.
The case is still ongoing, with the oral argument stage of the appeal taking place on June 28.
Why are books removed from the Internet Archive?
As a result of the lawsuit, IA was forced to remove over half a million books from its collection. Chris Freeland, the Director of Library Services at the Internet Archive, described “a very negative impact” on users.
According to the testimonies collected by the IA, the mass deletion resulted in students not being able to access the books for academic research.
While IA identifies itself as a library, traditional publishers, who reject the “controlled digital lending” approach, have compared it to a shadow library or a piracy database.
Despite the removals, the Internet Archive is still home to a rich collection.
As of the end of June, the web archive said it contained 835 billion web pages, 44 million books and texts, 15 million audio recordings, 10.6 million videos, 4.8 million images, and 1 million software programs. Live concerts and television programs are also part of this collection.
What is the Wayback Machine?
While the Internet Archive buys physical books, digitizes them, and lends them to users or makes them available for download, it has also focused on preserving web pages since 1996. The Wayback Machine is its service for browsing those saved pages, and the platform claims users can access more than 866 billion web pages through it.
“We began in 1996 by archiving the Internet itself, a medium that was just beginning to grow in use. Like newspapers, the content published on the web was ephemeral – but unlike newspapers, no one was saving it. Today we have 28+ years of web history accessible through the Wayback Machine and we work with 1,200+ library and other partners through our Archive-It program to identify important web pages,” the Internet Archive says on its website.
Users can help IA archive parts of the internet free of charge, or they can approach the platform to make their own work available to the public.
How can I use the Wayback Machine?
Using the Wayback Machine is easy and free, although results are not guaranteed.
To get started, navigate to the Wayback Machine web page, where you will see a bar in which you can enter the URL or keyword corresponding to the web page or content you are looking for. Then press ‘enter’ and wait for the results to be displayed.
If the content is new, rarely viewed, or was deleted before it could be archived, you may get few results or none at all.
However, you have a good chance of finding content such as old websites that no longer exist, previous versions of existing websites, deleted social media posts, archived versions of paywalled articles, and archived versions of content that is blocked or censored in your jurisdiction.
A graph on the results page shows how often the Internet Archive “crawled” the content over time, and you can click on the bubbles in the calendar to select “snapshots” of the web content from different dates. However, the service can be unreliable, and not all content is stored perfectly; broken links, missing media, or pages that won’t load are common.
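For readers who would rather check for a snapshot from a script than from the search bar, the lookup can be sketched in Python. This is a minimal sketch and is not drawn from the steps above: it assumes the Internet Archive's public availability endpoint at archive.org/wayback/available and the JSON layout it returns (an "archived_snapshots" object holding a "closest" entry), both of which may change. The function names are hypothetical and chosen only for illustration.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint of the Wayback Machine availability service.
AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def build_availability_url(page_url, timestamp=None):
    """Build the query URL asking whether a page has an archived snapshot."""
    params = {"url": page_url}
    if timestamp:
        # Optional target date in YYYYMMDD form; the service is assumed
        # to return the snapshot closest to it.
        params["timestamp"] = timestamp
    return AVAILABILITY_ENDPOINT + "?" + urlencode(params)


def closest_snapshot(payload):
    """Return (timestamp, snapshot_url) from a decoded response, or None."""
    closest = payload.get("archived_snapshots", {}).get("closest")
    if not closest or not closest.get("available"):
        return None  # nothing archived, or the snapshot is not retrievable
    return closest["timestamp"], closest["url"]


def lookup(page_url, timestamp=None):
    """Query the live service. Requires network access and may fail."""
    request_url = build_availability_url(page_url, timestamp)
    with urlopen(request_url, timeout=10) as response:
        return closest_snapshot(json.load(response))
```

Calling lookup("example.com", "20060101") would ask for the snapshot nearest to January 1, 2006. A result of None corresponds to the empty search described above, where the page was never crawled or could not be stored.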
While the Wayback Machine is useful for personal research or for accessing information sources, users should be cautious about relying on data obtained through these sources, as the information stored may sometimes be outdated or inaccurate.