But I think this will hurt them more than help as time goes on. IIRC, one news org blocked free access and their revenue fell. I think that was in Australia.
But it seems they are using AI as the reason, so allowing access after a week will not prevent AI access.
But what happens if an AI company subscribes to the news site using a person's name (or a fake name)? They will still get the article and avoid the hassle.
One of the tests for Fair Use in the US, as I understand it, would be whether the archived work "competes" with the original.
If people start going to IA instead to read the news, the newspaper might have a claim. But if they're doing it to get around paywalls, or purely for archival/historical/research purposes, that may be allowed.
But the reality is such decisions are subjective and will be up to whatever judge happens to get such a case in front of them if this is challenged.
People would equally reject Netflix if Netflix floated the idea of replacing the subscription model with pay-per-view micropayments.
> ...nobody proposes an alternative solution
Such is the human condition - some problems simply have no satisfactory solutions.
You sure about that?
Something like over half of Netflix viewers believe their subscription isn't justified by how much they watch or else they aren't sure of it. Less than half believe the subscription cost is justified.
Whether a PPV model would actually be cheaper for the first half is a good question, but it is possible. Certainly, in my case, I do not watch $20 worth of content on Netflix a month. I would gladly take PPV.
I would never watch 6 movies in a month, and the selection with Netflix sucks by comparison to what is available when renting.
The subscription services are only a good deal if you binge them.
Why does it work like that anyway? Every time I open a page on some sites, their vexing box shows up to waste my time. Five minutes later I want a different page on the same site and it does the same thing. They can't do it once and cache the result?
I do worry about their whitepaper recommending it for a CBDC[2] (linked from [3]) which points out the state can implement negative interest rates, and that its architecture requires the issuer to get involved even in "spot your friend a $20"-level use cases. Since the issuer would presumably be required to KYC everyone, that also creates a big surveillance problem.
[1]: https://www.taler.net/en/index.html
[2]: https://www.snb.ch/public/asset/de/www-snb-ch/publications/r...
A pay wall at the news site would just bankrupt the internet archive, and a pay wall at the internet archive will kill most public interest in the service.
This trend of outright banning the Internet Archive has me extremely worried. I fear a future where news articles are memoryholed, and no one can remember exactly what was reported and how sensational it all seemed.
I've been working on this project [0] for a while. Originally, I started with a tool that would allow people to snapshot webpages in their own browser, and they could selectively share their snapshots. Then by consensus, everyone could understand what exactly had changed, and they could draw their own conclusion about why.
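The "by consensus" idea can be sketched in a few lines. This is only an illustration, not how MemoryHole actually works: the function names and the exact-match hashing are assumptions, and real pages often differ per visitor (ads, timestamps), so a practical tool would need to normalize the HTML first.

```python
import hashlib
from collections import Counter


def digest(html: str) -> str:
    """Fingerprint one snapshot by hashing its raw HTML (hypothetical helper)."""
    return hashlib.sha256(html.encode("utf-8")).hexdigest()


def consensus(snapshots: dict[str, str]) -> tuple[str, float]:
    """Given submitter -> HTML, return the most common digest and its share."""
    counts = Counter(digest(html) for html in snapshots.values())
    top_digest, votes = counts.most_common(1)[0]
    return top_digest, votes / len(snapshots)


# Three independent snapshots of the same URL; one differs from the others.
submitted = {
    "alice": "<h1>Mayor resigns</h1>",
    "bob": "<h1>Mayor resigns</h1>",
    "carol": "<h1>Mayor steps back</h1>",
}
top, share = consensus(submitted)
print(top == digest("<h1>Mayor resigns</h1>"), round(share, 2))
```

The more independent submitters agree on a digest, the harder it is to claim that a given version of the page never existed.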
While working on it, I realized that an authoritative answer to "what did it look like on $DATE" can't be produced by a no-name company. It's gotta be a non-commercial entity that's got a track record of integrity. The dream would be to allow MemoryHole customers to submit their snapshots to the Internet Archive (or other non-commercial entity). It's definitely a copyright nightmare - so no clue how this could work.
[0] - https://memoryhole.app
It could work as a decentralized free and open source system that doesn't care about copyright. Like how torrents work now, but it would be good to have it work over Tor or something. Perhaps as a DAO for the management aspect of it. I don't know how exactly. But disregarding copyright by using a centralized company is the wrong idea.
Or you can do the lawful approach and try to work within the framework of that copyright nightmare. But "fuck copyright" is an easier path.
The torrent approach is nice. I could imagine a self-hosted way to store the data (for a group of people).
Linkwarden does this well. You can share a collection for a small group of people.
Tor is fine especially for onion sites. You just have to understand the limitations.
(I2P is also good.)
Is there a way to export/download my saves in a reasonable way?
It looks like this:
├── files
│   └── 632daffb-2f4f-4795-bb4d-3149d24f4264
│       ├── original.html
│       ├── readerview.html
│       └── screenshot.png
├── manifest.json
└── metadata.csv
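Given that layout, a small script could enumerate what an export contains. This is a rough sketch that relies only on the directory structure shown above; the function name is made up, and it deliberately does not parse manifest.json or metadata.csv because their fields are not described here.

```python
from pathlib import Path


def list_snapshots(export_root):
    """Map each snapshot folder under files/ to the sorted names of its files."""
    files_dir = Path(export_root) / "files"
    snapshots = {}
    for folder in sorted(files_dir.iterdir()):
        if folder.is_dir():
            snapshots[folder.name] = sorted(
                p.name for p in folder.iterdir() if p.is_file()
            )
    return snapshots
```

Calling `list_snapshots()` on the unpacked export directory would return one entry per snapshot ID, which makes it easy to spot saves that are missing a screenshot or reader view.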
The next natural thing to happen would be privatization or consolidation of the internet itself. It's already happening in the form of grabbing and consolidating IPv4 addresses.
Blocking archiving in a flailing attempt to keep AIs away is extremely shortsighted. Archiving is important for keeping historical context, especially when it comes to news and journalism.
One possible solution that I can think of for the long-term good could be to allow archival but no retrieval of the latest information for at least 6 months or a year. This should theoretically satisfy most goals.
It is not hard to imagine a future in 50 years' time where a huge percentage of this content is lost forever, or at best incredibly hard to find.
Similarly and tangentially, when the US Constitution was made in an era of horseback/carriages, it explicitly authorized the creation of a public national postal service (USPS).
If we extended that older public policy with today's technological context, they would have authorized a national Internet Service Provider. (And, like with USPS, specialized private competitors would exist.)
(It is worth noting that at least in Sweden "published" here has a very specific meaning, that doesn't include personal websites etc, but it does include news outlets.)
My own last project before I left was to ingest records from crawl dumps from the defunct cuil.com website. About 200 TB of stuff that brought back 60 billion URLs.
The nature of the internet has changed and it's become an ephemeral place for many people where you just throw things in and others mine it as "data".
I'm sure that plays a role, but still... This obviously is about cost and money making, not security as a whole (ime)
It's more the case when the addresses and birthdates of public figures, which are often a matter of public record, enter the picture. But with a bit of data, it's easier to find out information about a lot of people than most realize, if anyone really cares to investigate.
Redditors then had the gall to pretend like it wasn’t their number one use case.
Back issues (usually at least a few years old) are available via JSTOR for free in small amounts and through subscriptions for bulk users. I'm sure there's some reason to fight about the details, but from a distance it looks like a pretty good compromise.
Obviously, a business needs to have an income, but it's becoming more common for businesses to function first and foremost as revenue generators, and the thing that enables that is only seen as a means to an end. When the quality of the product/service and its function as a revenue generator diverge, the product/service will always take 2nd chair.
Maybe we could argue that the primary product is the revenue, especially when there are investors involved who are looking for big returns.
https://www.uh.edu/news-events/stories/052815watchingtvracia...
https://www.mediamatters.org/legacy/video-what-happens-when-...
Historically-speaking, if your local news can twist the context to make you easier to sell to (products, services, ideologies), they will do that.
There is no media theory of information of what happens when info explodes beyond capacity of the system to consume it. (UN report on Attention Economy says less than 1% is actually consumed by humans)
So media orgs, instead of coming up with one, just keep mindlessly doing what they know how to do - generate more info. Platforms and corps subsidize this activity for their own interests.
So media orgs have no signal/warped signals of how useless what they are doing is.
So their argument is that people who would be paying money at their paywalls, are going to IA to get their news for free? And if they can thwart those people, they'll show up and become monthly subscribers?
I am vaguely sympathetic to newspapers as a concept, though the actual owners of approximately all of them are just PE companies looking to extract maximum profit from this dying industry, not really trying to prolong their existence.
But I think everyone who is interested in subscribing to their newspapers' paywalls already has subscribed. Those of us who bypass paywalls with that archive.whatever site, or apparently IA (I have never tried it for this purpose) are doing so because there is zero chance we're going to (recurringly!) pay the asking price for some random out-of-town newspaper, The Verge, Bloomberg, whatever. It's fair game to call us immoral for that decision, but if (and it's a big if) this move prevents more people from being able to bypass a paywall, I predict zero incremental dollars will go to the news publishers.
Now most of those who spend money get access to relatively good news in comparison to those who don't. The interesting thing is that if you model the utility of a customer base as trifactorial (subscriptions, ad-supported, influence-ability) and you set ad-support to near zero you're left with this situation where those with no ability to pay are now overwhelmingly useful to the website provider only as an influenceable base.
"If you're not paying, you're not the customer, you're the product", we used to say[0]. It turns out that's true, but if you can't pay by looking at ads, you will pay by the actions you take when you believe what the actual customer wants you to believe.
0: Though sometimes you do pay and you're still "the product" haha!
For the latter, the archive could just limit access to stuff that's less than 7 days old.
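A minimal sketch of such an embargo check, assuming the archive records a capture timestamp for each snapshot. The names here are hypothetical, and the window is a parameter, so the 6-month or 1-year delay suggested elsewhere in the thread would work the same way.

```python
from datetime import datetime, timedelta, timezone

EMBARGO = timedelta(days=7)  # assumption: the 7-day window from the comment above


def is_retrievable(captured_at, now, embargo=EMBARGO):
    """Return True once a snapshot is at least `embargo` old."""
    return now - captured_at >= embargo


now = datetime(2025, 1, 15, tzinfo=timezone.utc)
fresh = datetime(2025, 1, 12, tzinfo=timezone.utc)  # 3 days old
old = datetime(2025, 1, 1, tzinfo=timezone.utc)     # 14 days old

print(is_retrievable(fresh, now))  # False
print(is_retrievable(old, now))    # True
```

The snapshot is still taken immediately, so nothing is lost to history; only public retrieval is delayed.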
I don't see why every news outlet doesn't just do this.
Spite? No evidence of that. They probably just don't want to lose the money from paying customers and ads. You're just making up fantasy. Perhaps projecting your own spite.
you listed
1. buying the cheapest groceries you can reasonably find
2. trying to get the highest salary you can
3. literally any time you try to get more for yourself
that's a weak list from which to conclude that greed isn't a problem, especially since in the case of 1. and 2. someone's making money off you, the person who's supposedly greedy in these scenarios.
In which case the archive is a major revenue drain.
I'm pretty sure something similar was done for the newspaper. However, the oldest paper was bought and killed decades ago, so I'm not sure what happened there.
While not as convenient as a live website, most news sources will have an actual physical archive that you can access with some real intent.
2. And that if news sites offered the text for free but paywalled images they'd be more sustainable than they are now?
"Since the early 2000s, the U.S. has lost about 40% of its local newspapers and about 75% of the jobs in newspaper journalism, according to a 2025 report from the Medill School of Journalism at Northwestern University. A study published last year by Rebuild Local News and Muck Rack shows that in 2002, there were roughly 40 journalists per 100,000 people in the United States. Today, it’s down to about eight journalists."[0]
[0] https://theconversation.com/why-the-pittsburgh-post-gazettes...
The Internet Archive at least provides one solution there, especially given the somewhat dubious practices Archive.is/today seems to be up to at the moment.
But I suspect that's probably another reason these sites don't want their work archived.