Data Snapshots

Important Note: For now, snapshots are generated intermittently. We'll update this message when the snapshots are back on a regular schedule.

If you want to work with a lot of Kiva's data, making hundreds or thousands of requests to the Kiva API can be burdensome, both for you and for your network. For that reason, we make much of our data available as snapshots, each compressed into a single download. The data is archived nightly, so snapshots are most useful for apps that don't require live data, such as data analyses and visualizations. Some applications may also find them handy for seeding local data stores, supplementing the snapshot with calls to the Kiva API for the most recent data.

A data snapshot consists of three data files delivered in a single ZIP archive. Snapshots are available in CSV and JSON formats. For the most part, the format of the JSON snapshot is the same as that of an API response, with a few exceptions.

Downloading Snapshots

The latest data snapshots are available in the format of your choice at the following URLs:

JSON: http://s3.kiva.org/snapshots/kiva_ds_json.zip
CSV: http://s3.kiva.org/snapshots/kiva_ds_csv.zip
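
As an illustration, a snapshot could be fetched with nothing but the Python standard library. This is a minimal sketch, not an official client: the function name download_snapshot and the SNAPSHOT_URLS table are hypothetical names chosen for this example, and only the two URLs come from the list above.

```python
import shutil
import urllib.request
from pathlib import Path

# The two URLs are the ones listed above; the dict itself is just a convenience.
SNAPSHOT_URLS = {
    "json": "http://s3.kiva.org/snapshots/kiva_ds_json.zip",
    "csv": "http://s3.kiva.org/snapshots/kiva_ds_csv.zip",
}


def download_snapshot(url, dest):
    """Stream the archive at `url` to `dest` without holding it all in memory."""
    dest = Path(dest)
    with urllib.request.urlopen(url) as response, open(dest, "wb") as out:
        shutil.copyfileobj(response, out)
    return dest
```

Calling download_snapshot(SNAPSHOT_URLS["json"], "kiva_ds_json.zip") would save the JSON snapshot to the current directory. Streaming to disk is used here because the archives may be large.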

Archive Structure

When you extract a data snapshot, you'll have a collection of files with the following structure (in the CSV snapshot, the files have the .csv extension):

kiva_ds_json/
	lenders.json
	loans.json
	loans_lenders.json
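
Given the layout above, one file can be read straight out of the archive without extracting it first. The sketch below is one possible approach under stated assumptions: load_snapshot_member is a hypothetical helper, it matches on the file's base name because the exact member paths inside the ZIP may differ, and it assumes each file is a single JSON document small enough to parse in memory.

```python
import json
import zipfile
from pathlib import PurePosixPath


def load_snapshot_member(archive_path, filename):
    """Parse one JSON file from the archive, matching on its base name only."""
    with zipfile.ZipFile(archive_path) as archive:
        for member in archive.namelist():
            # Ignore any directory prefix such as "kiva_ds_json/".
            if PurePosixPath(member).name == filename:
                with archive.open(member) as handle:
                    return json.load(handle)
    raise KeyError(f"{filename} not found in archive")
```

For example, load_snapshot_member("kiva_ds_json.zip", "loans_lenders.json") would return the parsed contents of the mapping file. For very large files, extracting to disk and using a streaming parser may be a better fit.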

File Format

JSON snapshot

As mentioned, the format of the JSON snapshot is very similar to the API format. The main difference is the loans_lenders mapping file, which links records in the loans file to records in the lenders file. A record in loans_lenders.json might look like this:

{ "loan_id": "558112",
  "lender_ids": ["muc888", "tristan7990", "shivaun4955", "sam44598568", "mike4896",
	"catherine7003", "summer7416", "jim8842", "brett5260", "roger2252", "jolanda1942",
	"anila7468", "elizabeth31676552"] }

Here, the loan_id field references the id of the loan, and lender_ids lists the ids of the lenders who lent to that loan.
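
To show how the mapping might be used, the sketch below builds lookups in both directions from a list of such records. It assumes the records have already been parsed into Python dicts with the loan_id and lender_ids fields shown above; index_loans_lenders is a hypothetical helper, and how the records are arranged within the file is not specified here.

```python
from collections import defaultdict


def index_loans_lenders(records):
    """Return (lenders_by_loan, loans_by_lender) built from mapping records."""
    lenders_by_loan = {}
    loans_by_lender = defaultdict(list)
    for record in records:
        loan_id = record["loan_id"]
        lender_ids = record["lender_ids"]
        lenders_by_loan[loan_id] = lender_ids
        # Invert the mapping so a lender's loans can be looked up directly.
        for lender_id in lender_ids:
            loans_by_lender[lender_id].append(loan_id)
    return lenders_by_loan, dict(loans_by_lender)
```

With these two dictionaries, the ids can be joined against the records in the loans and lenders files as needed.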

CSV snapshot

The CSV snapshot is a direct translation of the JSON snapshot. Fields that are represented as arrays in the JSON snapshot are translated to comma-separated lists. For instance, consider the following record in loans.json:

{ "borrowers": [ {"name": "Bunsuor", "gender": "male", "pictured": true},
		{"name": "Chamroen", "gender": "male", "pictured": true}]
 }

In the loans.csv snapshot, this would be split into three separate fields, with the values in each field separated by a comma and a space:

borrower_names,borrower_genders,borrower_pictured
"Bunsuor, Chamroen","male, male","true, true"