What is the Reporting API?

The MX Reporting API enables a partner to track changes to all data held on the MX platform for their clients without having to read the data individually for each user. It does this by providing daily change files that indicate how objects have changed throughout the day.

This guide provides best practices for how partners should consume and stage the data within their own systems. It should be read together with the Reporting API technical reference.


Accessing your files

Files are generated daily and are available for up to seven days after generation. It is the partner’s responsibility to download them within that window. For more information, refer to the availability of files section of the technical reference.

The Reporting API does not provide a full snapshot of historical data. If historical data is desired, MX can generate a one-time data snapshot for an additional cost.


Consuming daily reporting files

Objects on the MX platform are organized in a hierarchy. This means that your systems must consume daily reporting files in a particular order so that the objects are created, updated, and deleted in the proper order in your data store/warehouse.

First, you must consume all files with create actions in this order:

  1. Users
  2. Members
  3. Accounts
  4. Transactions
  5. Holdings
  6. Categories
  7. Tags
  8. Taggings
  9. Goals
  10. Budgets
  11. Notification Profiles
  12. Beat
  13. Beat Feedback
  14. Devices
  15. Analytics Events
  16. Analytics Page Views
  17. Analytics Screen Views
  18. Analytics Timed Events

Second, you must consume all update files in the same order as above.

Third, you must consume all delete files in the reverse order. This ensures transactions are deleted on your systems before the account they belong to is deleted, and so forth.
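As an illustration, the loop below sketches this ordering in Ruby. The consume helper is a placeholder for your own download-and-load logic, and the snake_case object and action names are assumptions based on the endpoint pattern shown later in this guide, not part of the API.

# Hypothetical sketch of ordered consumption; `consume` stands in for
# your own logic that downloads and applies one object type's daily file.
OBJECT_ORDER = %w[
  users members accounts transactions holdings categories tags taggings
  goals budgets notification_profiles beat beat_feedback devices
  analytics_events analytics_page_views analytics_screen_views
  analytics_timed_events
].freeze

def consume(object, action)
  # Download the daily file for `object`/`action` and apply it to your store.
end

OBJECT_ORDER.each         { |object| consume(object, "created") }
OBJECT_ORDER.each         { |object| consume(object, "updated") }
OBJECT_ORDER.reverse_each { |object| consume(object, "deleted") }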


Generating sample data for the integrations environment

Files in the Reporting API are generated from system events which represent user activity and aggregated account and transactional data. This presents a challenge in the integrations environment because there are no users creating activity on the system.

Partners who need a more robust set of test data can add their own by creating internal test users that add accounts and use the system for a few days to generate log events.

The following steps give an idea of how this could work.

  1. Create test users in your integration client using whichever MX API you use for this purpose, e.g., the Platform API or MDX v5 Real Time.
  2. Generate master_widget URLs for those users with the appropriate API, e.g., the Platform API or the SSO API.
  3. Copy the URL from the master_widget response and paste it into a browser window.
  4. Have your test user(s) use the system. Some recommended actions are:
    • Add savings, checking, loan, and investment accounts
    • Categorize transactions
    • Add tags to transactions
    • Create custom categories
    • Create and update goals and budgets
  5. The next day, new files containing the event logs of the actions performed will become available to download via the download daily files endpoints.
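Once generated, a file can be pulled programmatically. Below is a minimal Ruby sketch using the download endpoint pattern and headers shown in the byte-serving examples later in this guide; the client_id, api_key, date, and file names are placeholders.

require 'net/http'
require 'uri'

# Minimal download sketch; the URL pattern and headers follow the curl
# examples later in this guide. client_id and api_key are placeholders.
client_id = "your_client_id"
api_key   = "your_api_key"

uri = URI("https://int-logs.moneydesktop.com/download/#{client_id}/2019-10-07/transactions/created")
request = Net::HTTP::Get.new(uri)
request['Accept']     = 'application/vnd.mx.logs.v1+avro'
request['MD-API-KEY'] = api_key

Net::HTTP.start(uri.host, uri.port, use_ssl: true) do |http|
  http.request(request) do |response|
    File.open("20191007-transactions-created.avro", "wb") do |file|
      response.read_body { |chunk| file.write(chunk) }
    end
  end
end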

Dealing with large files through byte serving

Avro files can become very large (multiple gigabytes), which can result in partial downloads. This can be resolved with byte serving, which allows you to request the data in a set of ranged chunks that can later be assembled into the full raw Avro file.

1. Download file segments using the curl command

The command line tool curl can be used to download HTTP ranges by specifying the -r or --range option. This example shows a scenario where the Avro file is larger than 1GB. The first curl command specifies the range for the first gigabyte (0-1073741823) and the second command specifies the range for the rest of the data (1073741824-).

Downloading the first part of a file:

curl -X GET -r 0-1073741823 https://int-logs.moneydesktop.com/download/{client_id}/2019-10-07/transactions/created -o 20191007-transactions-created.avro.part1 \
  -H 'Accept: application/vnd.mx.logs.v1+avro' \
  -H 'MD-API-KEY: {api_key}'

Downloading the second part of a file:

curl -X GET -r 1073741824- https://int-logs.moneydesktop.com/download/{client_id}/2019-10-07/transactions/created -o 20191007-transactions-created.avro.part2 \
  -H 'Accept: application/vnd.mx.logs.v1+avro' \
  -H 'MD-API-KEY: {api_key}'

2. Assemble segments back into a single file

The next step is to assemble the two file segments into a single file. The cat command, cat input1 input2 > output, can be used for this purpose, where input1 and input2 are the file segments downloaded in the previous step. The output file is a complete Avro file.

Example

cat 20191007-transactions-created.avro.part1 20191007-transactions-created.avro.part2 > 20191007-transactions-created.avro
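The same flow can also be scripted end to end. Below is a sketch using Ruby's Net::HTTP and the standard HTTP Range header, appending 1 GiB chunks to one output file until a short final chunk arrives; the chunk size and credential placeholders continue the earlier download sketch and are assumptions, not requirements.

require 'net/http'
require 'uri'

# Ranged-download sketch: request 1 GiB chunks via the HTTP Range header
# and append each chunk to a single file, so no separate cat step is needed.
CHUNK = 1_073_741_824 # 1 GiB
client_id = "your_client_id"
api_key   = "your_api_key"
uri = URI("https://int-logs.moneydesktop.com/download/#{client_id}/2019-10-07/transactions/created")

File.open("20191007-transactions-created.avro", "wb") do |file|
  offset = 0
  loop do
    request = Net::HTTP::Get.new(uri)
    request['Accept']     = 'application/vnd.mx.logs.v1+avro'
    request['MD-API-KEY'] = api_key
    request['Range']      = "bytes=#{offset}-#{offset + CHUNK - 1}"

    body = Net::HTTP.start(uri.host, uri.port, use_ssl: true) do |http|
      http.request(request).body
    end

    file.write(body)
    break if body.bytesize < CHUNK # a short chunk means end of file
    offset += CHUNK
  end
end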

Object history and revision numbers

Data returned in the Avro files represents a history of changes made to objects on the MX platform throughout the day. This means it is possible for a single daily file to have multiple entries for the same object. To enable partners to know the sequence of events, the fields revision and updated_at are included on each resource entry. Partners should use the entry with the latest revision to ensure that the latest version of the object record is consumed.
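For example, here is a sketch of keeping only the latest revision per object when loading a daily file. The guid key is an assumption for this sketch; use whatever field uniquely identifies the resource in your files.

# Keep only the highest-revision entry per object.
# `rows` is an array of record hashes read from a daily Avro file;
# the "guid" key is an assumption for this sketch.
latest = rows
  .group_by { |row| row["guid"] }
  .map { |_guid, entries| entries.max_by { |entry| entry["revision"] } }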


Parsing Avro files

Avro files are designed so they can be parsed and serialized easily into other formats. Avro’s documentation provides guidance on parsing these files using different methods.

Below we show how to read an Avro file in a Ruby script and parse the output to JSON and CSV. We use the avro gem, available from rubygems.org, to parse the Avro file.

Avro to JSON

require 'avro'
require 'json'

json_array = []
avro_file_path = "some_avro_file.avro"

# Each record is read as a Ruby hash; serialize each one to a JSON string.
Avro::DataFile.open(avro_file_path, "r") do |reader|
  reader.each do |row|
    json_array << row.to_json
  end
end

Avro to CSV

require 'avro'
require 'csv'

result_file_path = "some_csv_file.csv"
avro_file_path = "some_avro_file.avro"

Avro::DataFile.open(avro_file_path, "r") do |reader|
  CSV.open(result_file_path, "a+") do |csv|
    reader.each_with_index do |row, index|
      # Write the header row from the first record's keys, then one row per record.
      csv << row.keys if index == 0
      csv << row.values
    end
  end
end

Sample files

Below are some sample Avro files generated from the Reporting API.


Schema evolution

MX may add fields to the writer schemas included in the response files at any time. New fields are added with backward compatibility in mind, and the order of fields may change. This follows the standard set forth in Avro’s schema resolution documentation.

A reader of Avro data, whether from an RPC or a file, can always parse that data because its schema is provided. But that schema may not be exactly the schema that was expected. For example, if the data was written with a different version of the software than it is read, then records may have had fields added or removed.

We call the schema used to write the data the writer’s schema, and the schema that the application expects the reader’s schema. Differences between these should be resolved according to the details outlined in the schema resolution section of Avro’s official documentation.

To support such use cases, MX includes default values for all of its fields.
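In practice, this means you can pin a reader’s schema and let Avro resolve any differences. Below is a sketch with the Ruby avro gem, assuming you keep your expected schema in a local .avsc file; the file names are placeholders.

require 'avro'

# Resolve the file's writer schema against a pinned reader schema.
readers_schema = Avro::Schema.parse(File.read("transactions_created.avsc"))
datum_reader   = Avro::IO::DatumReader.new(nil, readers_schema)

File.open("20191007-transactions-created.avro", "rb") do |io|
  Avro::DataFile::Reader.new(io, datum_reader).each do |row|
    # `row` matches the reader schema, with defaults filled in for any
    # fields missing from the writer schema.
  end
end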