How long will data be stored in CaltechDATA?

In most cases, indefinitely.  Any data that violates the Terms of Deposit or fails to meet minimum standards for retention may be withdrawn.  For example, we may eventually remove deposits that consist of unusable or obsolete files or that are inadequately described.  Larger files may be held to higher standards for retention.  The DOI for every record will be retained and will lead to a tombstone page listing the reason for withdrawal.

Who runs CaltechDATA?

CaltechDATA is a service of the Caltech Library as part of CODA (http://libguides.caltech.edu/CODA).  All members of the Caltech community can upload research data for long-term preservation and public access.

 

Who can upload files to CaltechDATA?

Any member of the Caltech community with an access.caltech username can upload files to CaltechDATA.  For issues with your username or password, please contact IMSS.

 

What can I upload to CaltechDATA?

You can upload any digital files to CaltechDATA.  You can also import software directly from GitHub for long-term preservation.  Publications should be submitted to CaltechAUTHORS.

 

Is there anything that can't be deposited to CaltechDATA?

You must have the rights to any data you deposit.  Data cannot violate the publicity, privacy, or confidentiality rights of others or be covered by HIPAA or FERPA.  Read our Terms of Deposit for complete details.

 

How much data can I store on CaltechDATA?

There are currently no hard storage limits, but you should only deposit data that will be useful to others.  All data must be described in sufficient detail so that others can understand it.  If you're planning on uploading more than 500 GB of data, please contact us at data [at] caltech.edu first.

 

Does CaltechDATA have any file size restrictions?

No, but file size can affect how quickly files are available.  Files under 1 GB will always be immediately available.  Files under 100 GB may remain immediately available only if they are accessed at least once every four months.  Otherwise, files may be stored on Infrequent Access Storage (IAS) and may take up to 24 hours to retrieve.

 

How do I access my CaltechDATA files in Infrequent Access Storage (IAS)?

Currently no files are stored on IAS.  When this feature becomes active, instructions will be provided.

 

Can I make my files in CaltechDATA private?

Not indefinitely.  You can embargo data for a specific period of time, but all data must be intended for public access at some point in the future.

 

Can I restrict data in CaltechDATA to specific users or groups?

No, CaltechDATA is an open repository.

 

Does my data need to be published before uploading to CaltechDATA?

No, you can upload unpublished data.  If you wish, you can embargo the data until publication.  You can easily link your data to publications by entering the DOI in the related publications field.

 

What metadata fields are used in CaltechDATA?

Our metadata is derived from the DataCite 4.0 schema, and is compliant with the Project Open Data 1.1 schema (used by US Federal Government Agencies).  Our metadata includes an explicit related publications list.
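
To make the related publications list more concrete, here is a rough sketch of a record description that uses DataCite 4.0 property names.  The dictionary layout, the sample values (including the placeholder DOI), and the helper functions missing_required and related_dois are assumptions made for illustration; the fields CaltechDATA actually requires may differ.

```python
# DataCite 4.0 mandatory properties, leaving out the identifier (DOI),
# which this sketch assumes is assigned by the repository.
REQUIRED = ("creators", "titles", "publisher", "publicationYear", "resourceType")

# Illustrative record; every value below is a placeholder.
record = {
    "titles": [{"title": "Example mineral spectra"}],
    "creators": [{"creatorName": "Doe, Jane"}],
    "publisher": "CaltechDATA",
    "publicationYear": "2017",
    "resourceType": {"resourceTypeGeneral": "Dataset"},
    # Related publications are expressed as DataCite relatedIdentifiers.
    "relatedIdentifiers": [
        {
            "relatedIdentifier": "10.1234/example-doi",
            "relatedIdentifierType": "DOI",
            "relationType": "IsSupplementTo",
        }
    ],
}


def missing_required(metadata):
    """Return the required property names that are absent or empty."""
    return [name for name in REQUIRED if not metadata.get(name)]


def related_dois(metadata):
    """Return the DOIs listed as related identifiers."""
    return [
        item["relatedIdentifier"]
        for item in metadata.get("relatedIdentifiers", [])
        if item.get("relatedIdentifierType") == "DOI"
    ]


if __name__ == "__main__":
    print("Missing required fields:", missing_required(record))
    print("Related publication DOIs:", related_dois(record))
```

Checking for missing fields before you submit a record makes it easier to catch an inadequately described deposit early.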

What can I do with data in CaltechDATA?

You can use the read API to load files in CaltechDATA into another application.  For example, the web application at plots.caltechlibrary.org allows you to interactively plot two mineral spectra files (https://data.caltech.edu/records/208, https://data.caltech.edu/records/209).  You can see the API code that generates this demo at https://github.com/caltechlibrary/caltechdata_plot.  Feel free to send us an email at data [at] caltech.edu if you'd like help integrating the repository with your application.
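
As a rough sketch of what a read request might look like, the Python below turns a record page URL into an API URL and fetches the record metadata as JSON.  The /api/record/<id> pattern and the helper names are assumptions, not documented behavior; check the caltechdata_plot code linked above for the calls the demo actually makes.

```python
import json
import urllib.request
from urllib.parse import urlparse

# Assumption: record metadata is served as JSON at /api/record/<id>.
API_TEMPLATE = "https://data.caltech.edu/api/record/{record_id}"


def record_id_from_url(record_url):
    """Extract the numeric id from a URL like https://data.caltech.edu/records/208."""
    parts = [p for p in urlparse(record_url).path.split("/") if p]
    if len(parts) != 2 or parts[0] != "records" or not parts[1].isdigit():
        raise ValueError(f"not a record URL: {record_url}")
    return int(parts[1])


def api_url(record_url):
    """Build the (assumed) API URL for a record page URL."""
    return API_TEMPLATE.format(record_id=record_id_from_url(record_url))


def fetch_record(record_url, timeout=30):
    """Download and parse the record metadata; requires network access."""
    with urllib.request.urlopen(api_url(record_url), timeout=timeout) as response:
        return json.load(response)


if __name__ == "__main__":
    # Only print the URLs here; call fetch_record() to make the request.
    for url in (
        "https://data.caltech.edu/records/208",
        "https://data.caltech.edu/records/209",
    ):
        print(api_url(url))
```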

Is there any charge for storing data in CaltechDATA?

CaltechDATA storage is provided by the Library at no cost in most cases.  If you plan to deposit more than 500 GB of data, please email us at data [at] caltech.edu to discuss your requirements.

Can I make changes to a record in CaltechDATA?

Yes, once you're logged in you should see an edit button appear for all records you created.  This allows you to edit the metadata in the record.  If you need to change the files associated with a record, send us an email at data [at] caltech.edu.

How do I use the CaltechDATA API to create records?

First, you need to generate an access token.  Log into CaltechDATA, and then click on your user menu (the person icon in the upper right-hand corner).  Then click "Applications".

[Image: Menu Option]

Click on the "+ New Token" button in the Personal access tokens section.

[Image: Token button example]

Make up a name for your token and check all of the scope buttons.

[Image: Token details]

Your token will be shown on screen.  Copy it down and store it somewhere secure.  It functions just like an account password.  

You can create records using our Python library, caltechdata_api.  To install it, download the source code of the latest release, extract the file, and navigate to the caltechdata_api-x.x.x directory on the command line.  Then type 'python setup.py install'.

To use the library, you'll need to set the access token you just created.  Type 'export TINDTOK=TOKEN', where TOKEN is replaced by your actual token, or use the token.bash script that is distributed with the library.
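
As a minimal sketch of the Python side, a script can read the token from the TINDTOK environment variable rather than hard-coding it.  The helper name load_token is hypothetical and is not part of caltechdata_api.

```python
import os


def load_token(env=None, name="TINDTOK"):
    """Return the access token stored in the given environment variable.

    load_token is an illustrative helper, not part of caltechdata_api.
    """
    env = os.environ if env is None else env
    token = env.get(name, "").strip()
    if not token:
        raise RuntimeError(f"{name} is not set; run 'export {name}=TOKEN' first")
    return token


if __name__ == "__main__":
    token = load_token()
    # Never print the token itself; it works like a password.
    print(f"Loaded a token of length {len(token)}")
```

Failing early with a clear message is easier to debug than an authentication error partway through an upload.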

Some scripts used for creating more complex data records are located in the caltechdata_migrate repository.  An example that publishes a Mercurial repository to CaltechDATA is available at caltechdata_hg.

We're also here to help.  Just send us an email at data [at] caltech.edu.