How to create a dataset

How to transform your raw customer data into prediction-ready data with Faraday

Table of Contents
  1. What is Datasets?

  2. Accessing Datasets

  3. Creating a dataset

  4. Adding an identity set

  5. Adding events

  6. Adding traits

What is Datasets?

In Faraday, Datasets is where you plug in your customer data and organize it so that it can be used for predictions. Here, you'll organize your data into identity sets, events, and traits that can be used throughout the Faraday platform. Once your data is in Faraday, you can use it to target the desirable outcomes that your customers achieve, such as purchases, among your leads.

Accessing Datasets

To access Datasets, select the Consoles menu on the left-hand navigation bar, then find the Datasets icon. If your screen is large enough, the Consoles menu may already be expanded.

Creating a dataset

Inside Datasets, you'll find a list of your current datasets if you have any, as well as columns for:

  • Source: the source type of this dataset, such as Hosted CSV.

  • Row Count: the number of rows in the dataset.

  • Identities: the number of unique identities, or people, in the dataset.

  • Events: the event type of the dataset, such as orders or churns.

To create a dataset:

  1. Select + New dataset in the upper right of the Datasets screen.

  2. Next, choose whether to create your dataset from a connection or from a Hosted CSV. The connections listed here are pulled from the Connections console, which is where you can add new ones. Check out our Connections article collection for instructions on how to add new connections.

    1. Connection:

      1. To configure a BigQuery dataset, for example, ensure that you have a connection created to your BigQuery instance.

      2. Then, select BigQuery as the dataset type.

      3. Give your dataset a name.

      4. Specify the table that you'd like to use for the dataset.

      5. Click Finish.

    2. Hosted CSV:

      1. Give your dataset a name.

      2. Either drag your CSV to the file dropper, or click the file dropper to browse and select your CSV.

      3. Click Finish.

  3. Regardless of whether you chose a connection-based or CSV-based dataset, after clicking Finish, you'll receive a notification that your dataset has been created, and you'll be moved to the edit dataset view where you can customize it.
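Before uploading a Hosted CSV, it can help to check the file's header and row count locally, so you can compare them against the Row Count that Datasets shows afterward. The following is a minimal sketch using Python's standard csv module. The column names and sample rows are hypothetical, and nothing here calls Faraday itself.

```python
import csv
import io


def summarize_csv(text: str) -> dict:
    """Return the header names and the number of non-empty data rows."""
    reader = csv.reader(io.StringIO(text))
    header = next(reader, [])
    # Skip rows that are entirely blank so they don't inflate the count.
    rows = [row for row in reader if any(cell.strip() for cell in row)]
    return {"columns": header, "row_count": len(rows)}


# Hypothetical sample data; replace with the contents of your own file.
sample = (
    "email,address,first_purchase_date,couch_color\n"
    "ann@example.com,1 Main St,2023-01-15,blue\n"
    "bob@example.com,2 Oak Ave,2023-02-03,gray\n"
)

summary = summarize_csv(sample)
print(summary["columns"])
print(summary["row_count"])
```

To run this against a real file, you could read it with open("your_file.csv", newline="") and pass the text to summarize_csv.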

Adding an identity set

Identity sets are used to help Faraday identify people in your data. With this information, you can create cohorts of your customers (or anyone else identified in your data) and outcomes to predict things about these individuals.

  1. In the dataset's Definition (default) tab, click + Add identity set to get started, which will open the new identity set window.

  2. Give your identity set a name.

  3. Next, match the properties that exist in your data in the Field in dataset column with the Faraday property names in the left column.

    💡 Not all property fields are required, but email and address are the most useful for identifying people. The more fields you include, the more likely Faraday is to match the people in your data.

  4. Once you're done matching your properties, click Finish to save the identity set. If you need to edit or delete the identity set at any point, click the three dots (...) on the right.
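If you'd like to plan the identity set mapping before opening the window, you can write it down as a simple lookup and confirm that every column you intend to select actually exists in your data. This is only a planning sketch: the property labels follow the email and address examples above, while the column names (customer_email, street_address) are hypothetical.

```python
def missing_columns(mapping: dict, dataset_columns: list) -> list:
    """Return the mapped column names that are not present in the dataset."""
    return [col for col in mapping.values() if col not in dataset_columns]


# Keys: property labels to fill in. Values: hypothetical columns in your data.
identity_mapping = {
    "email": "customer_email",
    "address": "street_address",
}

# Hypothetical header row from your dataset.
columns = ["customer_email", "street_address", "first_purchase_date"]

print(missing_columns(identity_mapping, columns))
```

An empty list means every column in the plan is available to select in the Field in dataset column.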

Adding events

Events show Faraday how to recognize actions taking place in your data, which are often (but not always) recurring, such as purchases, renewals, click events, and upsells. Dates are often the most useful piece of data for events.

Event streams that you define in datasets are available for selection when creating cohorts, which are then used to create outcomes, which then go on to help build your predictive pipelines.

  1. In the dataset's Definition (default) tab, click + Add an event to get started, which will open the new event window.

  2. Next, choose whether to add the event to an existing event stream, or create a new one. Since we're starting from scratch here, we're going to create a new one.

    💡 Unsure which option to select? Generally, if the new dataset you're creating contains event data that a previously created dataset also includes, such as order or churn dates, you'll want to add this event to that existing event stream to keep your data clean.

  3. In the following screen, give your event a name and match the properties that exist in your data in the Field in dataset column with the Faraday property names in the left column. In this dataset, we're selecting the date that a customer made their first purchase as the date timestamp property. Not all properties are required, but if a property is selected, the corresponding Format field is required.

  4. Once you're done matching your properties, click Finish to save the event. If you need to edit or delete the event at any point, click the three dots (...) on the right.
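Because a selected property also needs a Format, it's worth confirming that the dates in your data follow one consistent pattern before you map them. The sketch below uses Python's datetime.strptime to list values that don't fit an assumed pattern. The sample dates are hypothetical, and the "%Y-%m-%d" pattern is Python's own syntax, which may differ from what the Format field expects.

```python
from datetime import datetime


def unparseable_dates(values, fmt):
    """Return the values that do not match the given strptime pattern."""
    bad = []
    for value in values:
        try:
            datetime.strptime(value, fmt)
        except ValueError:
            # strptime raises ValueError when the text doesn't fit the pattern.
            bad.append(value)
    return bad


# Hypothetical first-purchase dates; the last one uses a different layout.
dates = ["2023-01-15", "2023-02-03", "03/02/2023"]

print(unparseable_dates(dates, "%Y-%m-%d"))
```

Any values that come back are worth cleaning up in the source data before the event is defined.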

Adding traits

Traits are interesting data points that can enhance the usefulness of your data in Faraday, but aren't used to identify a person or an event. For example, color of a product, whether a person owns or rents their home, hobbies, income, etc. These traits can be appended to pipelines, used to create cohorts, used for analysis, etc.

  1. In the dataset's Definition (default) tab, click + Add a trait to get started, which will open the new trait window.

  2. Next, give your trait a name.

  3. Lastly, choose the corresponding field in the dataset. For this example, we have a field in our data called couch_color that lists the color of couch that the customer purchased. We can use this down the line to add a personalized touch to our outreach.

  4. Once you're done adding your property, click Finish to save the trait. If you need to edit or delete the trait at any point, click the three dots (...) on the right. Feel free to repeat this process as many times as you'd like for however many traits you think you might use for your predictions.
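Before adding a field like couch_color as a trait, you may want to see which values it actually contains, since inconsistent spellings or capitalization can split one value into several. This is a minimal sketch with hypothetical rows; it trims and lowercases the values only for the purpose of counting and doesn't change your data.

```python
from collections import Counter


def value_counts(rows, field):
    """Count normalized values of a field, skipping empty entries."""
    return Counter(
        row[field].strip().lower()
        for row in rows
        if row.get(field)
    )


# Hypothetical rows from a dataset that includes a couch_color column.
rows = [
    {"couch_color": "Blue"},
    {"couch_color": "blue "},
    {"couch_color": "gray"},
    {"couch_color": ""},
]

counts = value_counts(rows, "couch_color")
print(counts.most_common())
```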

Once you've finished adding an identity set, event, and/or trait, click Save dataset to save it for use throughout Faraday.