Data Library now requires sign in

For the past year, our servers have often been overwhelmed by ill-behaved crawlers. You may have noticed pages not loading, broken images, or empty download files. Attempts to counter this threat with less disruptive tactics have been unsuccessful, so we are no longer allowing anonymous access to the web site. All users must sign in. If you don't already have an account, you can create one for free.

Why is it a problem now?

With the advent of AI, an increasing number of players are scraping the world wide web to feed their models. Too many of them do not respect the rules of engagement, which calls for drastic measures such as authentication.

How does this affect me?

Dataset web pages now require authentication. Create an account if you don't already have one, and sign in. Once signed in, the rest of your experience remains unchanged. Private datasets that require authentication will behave as before: if you have permission, you will go through; if not, you will be presented with the Terms and Conditions to access that specific dataset.

Only the web interface is affected. Download requests (e.g. netcdf, tiff, tsv, OPeNDAP/DODS) do not require authentication, so most scripts that download data automatically should not need to be changed. If you find that your script has stopped working, please contact help@iri.columbia.edu. Include the download URL that failed and the error message that was returned.
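
If one of your scripts does stop working, a small wrapper like the one below may make it easier to collect the two things we ask for: the download URL that failed and the error message. This is only a sketch in Python using the standard library. The function name try_download and its opener parameter are hypothetical, and the URL in the usage comment is a placeholder, not a real Data Library address.

```python
import urllib.error
import urllib.request


def try_download(url, dest, opener=urllib.request.urlopen, timeout=60):
    """Download url to dest.

    Returns None on success, or a short text report (URL plus error
    message) that can be pasted into an email to the help desk.
    The opener argument is a hypothetical hook so the function can be
    exercised without network access.
    """
    try:
        with opener(url, timeout=timeout) as response:
            data = response.read()
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status (e.g. 401 or 403).
        return f"URL: {url}\nError: HTTP {err.code} {err.reason}"
    except OSError as err:
        # URLError, timeouts and other connection problems end up here.
        return f"URL: {url}\nError: {err}"

    if not data:
        # An empty file is one of the symptoms described above.
        return f"URL: {url}\nError: empty response body"

    with open(dest, "wb") as out:
        out.write(data)
    return None


# Example usage (placeholder URL, replace with your own download link):
# report = try_download("https://example.org/path/to/data.nc", "data.nc")
# if report is not None:
#     print(report)
```

A return value of None means the file was written; anything else is the text to include in your message to help@iri.columbia.edu.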

Are more changes coming our way?

Most likely yes. We may implement further changes to improve the user experience while continuing to keep crawlers at bay. Some of those upcoming changes may be transparent to you; we will announce the others as appropriate.

Thank you for your understanding. We hope that this change will greatly improve the service. Contact help@iri.columbia.edu with any questions, and happy DL navigation!