
Pipelines overview

Pipelines (called scopes in the API) are how you choose which predictions you want to deploy, who you want to make those predictions on, and where they should go. When building one, you'll start with the population. Here, you'll choose which of your cohorts should be the population to include, or who you want to make predictions on. You can also optionally choose a population to exclude if you'd like to prevent a cohort from having predictions made.

Next, the payload is where you choose which predictions you'd like to make: outcomes, persona sets, recommenders, and in some cases, cohorts (see membership indicators). Additionally, you can enable prediction explanations via the Dashboard toggle or the explainability parameter in createscope or updatescope.

After your pipeline is created, you'll deploy your predictions by creating a deployment (called a target in the API) so that your predictions can be delivered to any connection you've made to a database, data warehouse, cloud bucket, a Faraday-managed connection, or to a CSV file. When you enable a pipeline via the "enable" toggle or the preview parameter of the updatescope API call, it will be kept up-to-date.
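
As a rough illustration of how population, payload, and enablement fit together, the sketch below assembles a request body for creating a pipeline (scope). Only the explainability and preview parameter names come from this page. The other field names (name, population, cohort_ids, exclusion_cohort_ids, payload, outcome_ids, persona_set_ids), the placeholder IDs, and the idea that preview set to false means "enabled" are assumptions made for illustration, so check the API reference for the actual schema.

```python
import json


def build_scope_body(name, include_cohort_ids, exclude_cohort_ids=None,
                     outcome_ids=None, persona_set_ids=None,
                     explainability=False, enabled=False):
    """Assemble a hypothetical createscope request body as a plain dict."""
    # Population: who to make predictions on, plus an optional exclusion.
    population = {"cohort_ids": list(include_cohort_ids)}
    if exclude_cohort_ids:
        population["exclusion_cohort_ids"] = list(exclude_cohort_ids)

    # Payload: which predictions to make, and whether to explain them.
    payload = {
        "outcome_ids": list(outcome_ids or []),
        "persona_set_ids": list(persona_set_ids or []),
        "explainability": explainability,
    }

    # Assumption: preview=True means the pipeline is not yet enabled.
    return {
        "name": name,
        "population": population,
        "payload": payload,
        "preview": not enabled,
    }


body = build_scope_body(
    name="Lead scoring pipeline",
    include_cohort_ids=["<customers-cohort-id>"],
    outcome_ids=["<outcome-id>"],
    explainability=True,
)
print(json.dumps(body, indent=2))
```

Nothing is sent over the network here; the dict would be the JSON body of whatever HTTP client call you use.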

🚧️Editing a pipeline

If a pipeline is edited, it will be disabled and will need to be re-enabled afterward.

For pipeline creation instructions using both the Dashboard UI and API, see our how-to docs for use cases.

👍Key takeaway: pipelines

Pipelines are how you choose which predictions you want to deploy through the payload, who you want to make those predictions on through the population, and where they should go through the deployment target. Pipelines are customized through a combination of cohorts and predictions (outcomes, persona sets, recommenders), and deploy to either a connection or a CSV file.

Additional pipeline information

Deployments to Faraday-managed connections

Faraday-managed connections can be created either to pull data from or to deploy predictions to other software within your stack, such as an ESP, CRM, or ad platform. They are initiated in connections. After creating a deployment using a Faraday-managed connection, please create a support ticket so that a Faraday support team member can initiate the deployment on Faraday's side.

Membership indicators

Membership indicators, or including cohorts in your payload, are useful for segmentation. For example, say you want to know who in your customer base has an income of greater than $100k. Your population to include for your pipeline would be your Customers cohort, and as part of your payload, you could then select a cohort of customers with an income of greater than $100k. As a result, anyone who is indicated as being in that $100k or greater cohort in your pipeline is a current customer with more than $100k in income.
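
The example above can be pictured as a simple set-membership check. This is only a conceptual sketch: the names, the in_cohort field, and the row layout are hypothetical and do not reflect the actual deployment columns.

```python
# Population to include: the Customers cohort (hypothetical people).
customers = {"ana", "ben", "cy", "dee"}

# Cohort added to the payload: people with income greater than $100k.
# "eli" is in this cohort but is not a customer.
income_over_100k = {"ben", "dee", "eli"}


def membership_rows(population, payload_cohort):
    """Return one row per person in the population with a membership flag."""
    return [
        {"person": person, "in_cohort": person in payload_cohort}
        for person in sorted(population)
    ]


rows = membership_rows(customers, income_over_100k)
flagged = [row["person"] for row in rows if row["in_cohort"]]
print(flagged)  # ['ben', 'dee']
```

Only people in the population get a row, so anyone flagged is both a customer and a member of the $100k cohort.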

Eligibility restrictions

If the outcome you select in your payload specifies an eligibility cohort, your pipeline is not restricted by that same eligibility cohort. For example, if your outcome's eligibility cohort is Leads (e.g. in a lead scoring outcome), and your pipeline's population to include is Everyone, then everyone will be scored, not just your leads.

Deployment formatting

Deployments (targets) can be created in the below formats via the Dashboard or the representation parameter in the createtarget API request:

Hashed (default): Best for deploying audiences to ad platforms. Data is hashed, and not human-readable.
Referenced: Best for merging data back into your stack. Uses a reference key defined in your dataset's advanced options to identify unique rows. If a reference key is not defined, this option is unavailable to select.
Identified: Best for direct mail and canvassing campaigns. Data is unhashed and human-readable.
Aggregated: Best for geotargeted ad campaigns. Select this to see the number of people in each payload element (outcome, persona, cohort) within the area of the geographic type you select.

Column headers

Additionally, you may select whether you'd like machine friendly or human friendly column headers via the Dashboard or the human_readable parameter in the createtarget API request:

Machine friendly: Best for automated systems where consistent naming is relevant.
Human friendly: Best for convenient, easy-to-read interpretation. Using human friendly makes your column headers instantly recognizable for what they are by including the outcome name and prediction type. This can help make your predictions easier to identify when deploying to ESPs, CRMs, etc, where you'll want to quickly be able to see a contact's persona or propensity score on their contact card.
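
To show how the format and column header choices might combine in a createtarget request, here is a minimal sketch that validates the options described above and returns them as a dict. The representation and human_readable parameter names come from this page; the nested shape (mode, aggregate), the lowercase format strings, and the helper itself are assumptions for illustration only.

```python
FORMATS = {"hashed", "referenced", "identified", "aggregated"}


def build_target_options(mode="hashed", human_readable=False,
                         reference_key_defined=False, aggregate_by=None):
    """Build hypothetical format options for a deployment (target)."""
    if mode not in FORMATS:
        raise ValueError(f"unknown format: {mode!r}")
    # Referenced is unavailable unless the dataset defines a reference key.
    if mode == "referenced" and not reference_key_defined:
        raise ValueError("referenced format requires a reference key")
    # Aggregated needs a geographic type, e.g. county, metro, state, zipcode.
    if mode == "aggregated" and aggregate_by is None:
        raise ValueError("aggregated format requires a geographic type")

    representation = {"mode": mode}
    if mode == "aggregated":
        representation["aggregate"] = aggregate_by
    return {"representation": representation, "human_readable": human_readable}


print(build_target_options())
print(build_target_options("aggregated", human_readable=True, aggregate_by="county"))
```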

Deployment filter

Deployment filters enable you to filter by the persona sets, outcomes, and cohort memberships you selected for your pipeline's payload. This is accomplished both in the Dashboard and the API via the filter parameter in the createtarget API request.

Filtering by a persona set allows you to target specific personas within a persona set. For example, selecting a persona set and choosing the "equal to" operator on a specific persona will include only that persona in the deployment.

Filtering by a recommender allows you to choose the highest-ranking recommendations within a given recommender.

Filtering by an outcome allows you to target a percent range of rows by percentile or score, enabling you to focus on only the people that matter most to you.

Outcome percentile is an integer between 1 and 100 (inclusive), and refers to the percentile of the outcome score distribution. The number of individuals in each percentile varies; as a rough estimate, the top 10 score percentiles correspond to the top 10% of the population. For example, entering greater than or equal to 81 would filter the top 20% of the population scored.
Outcome probability refers to the estimated probability of the outcome and is a decimal from 0 to 1. To enter a score correctly, include the leading zero and the decimal point. For example, a score of .5 would be entered as 0.5, and reflects a 50% probability that an individual will achieve the outcome.
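
The arithmetic behind the percentile example can be checked with a short sketch. It assumes equal-sized percentile buckets assigned by rank, which is a simplification: as noted above, the real number of individuals per percentile varies. The scores are made up.

```python
def assign_percentiles(scores):
    """Map each score to a percentile from 1 (lowest) to 100 (highest) by rank."""
    n = len(scores)
    order = sorted(range(n), key=lambda i: scores[i])
    percentiles = [0] * n
    for rank, i in enumerate(order, start=1):
        # Integer ceiling of rank * 100 / n.
        percentiles[i] = (rank * 100 + n - 1) // n
    return percentiles


# 1,000 hypothetical outcome probabilities from 0.001 to 1.0.
scores = [i / 1000 for i in range(1, 1001)]
percentiles = assign_percentiles(scores)

# Percentile filter: greater than or equal to 81.
top = [s for s, p in zip(scores, percentiles) if p >= 81]
print(len(top) / len(scores))  # 0.2

# Probability filter: greater than or equal to 0.5 (entered with the leading zero).
likely = [s for s in scores if s >= 0.5]
```

Percentiles 81 through 100 span 20 buckets, which is why the filter keeps about 20% of the population.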

📘Further reading: Faraday scoring

For further reading on Faraday scoring, see Propensity vs probability: Understanding the difference between raw scores and probabilities.

Deployment limit

Deployment limits allow you to limit your results to a top count or a bottom count of rows, either in the Dashboard or via the limit parameter in the createtarget API request.

Only the top/bottom count option lets you export an exact number of rows.

📘Additional limit info

This limit refers only to rows and not necessarily to individuals. For hashed targets in particular, there are likely to be 2-3 duplicate rows per person (one per email and physical address).
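
The difference between rows and individuals can be seen in a small sketch. The rows, the person field, and the apply_limit helper are hypothetical; a real hashed deployment would not expose a person label like this.

```python
# Hypothetical hashed rows: one row per email and per physical address.
rows = [
    {"person": "p1", "identifier": "email-hash", "score": 0.91},
    {"person": "p1", "identifier": "address-hash", "score": 0.91},
    {"person": "p2", "identifier": "email-hash", "score": 0.84},
    {"person": "p2", "identifier": "address-hash", "score": 0.84},
    {"person": "p3", "identifier": "email-hash", "score": 0.40},
]


def apply_limit(rows, count, direction="top"):
    """Keep the top (or bottom) `count` rows by score."""
    ordered = sorted(rows, key=lambda r: r["score"], reverse=(direction == "top"))
    return ordered[:count]


limited = apply_limit(rows, 3)
people = {row["person"] for row in limited}
print(len(limited), len(people))  # 3 2
```

A limit of 3 rows here covers only 2 people, so size the limit with the expected duplication in mind.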

📘Large pipelines

For larger pipeline sizes (20M+), the ordering is approximate and may not precisely represent the very top/bottom scoring individuals.

Structure

Deployment structure allows you to rename and reorder columns. Renaming them can make it even more convenient to import your data into your activation platform. For ad platform deployments like LinkedIn, Facebook, and Google Ads, selecting the appropriate option in the dropdown in the Dashboard, or via the custom_structure parameter in the createtarget API request, enables you to organize the file in a way that's convenient for upload to that platform.

📘Column naming conventions

Column names don't allow spaces, so if you receive an error when saving, check that you don't have any spaces in renamed columns. Instead of "Faraday propensity score," try "faraday_propensity_score."
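
If you rename many columns, a small helper can apply this convention for you. This is a sketch, not part of Faraday's tooling: to_column_name is a hypothetical function, and lowercasing is a stylistic choice that matches the example above rather than a stated requirement.

```python
import re


def to_column_name(label):
    """Replace runs of whitespace with underscores and lowercase the result."""
    return re.sub(r"\s+", "_", label.strip()).lower()


print(to_column_name("Faraday propensity score"))  # faraday_propensity_score
```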

Connection-specific

In this last settings option, you'll see the format options for hosted CSV deployments, or settings specific to the connection if you're deploying back to your database.

📘Advanced settings

These connection-specific settings are only recommended for advanced users and can safely be ignored otherwise.

Understanding deployment columns

A deployment in Faraday will include various points of data about your customers. When creating a deployment, in the structure section, you can select pre-formatted outputs for various destinations like Facebook, LinkedIn, and Google Ads to save the time & effort of formatting it yourself. In hashed deployments, personally identifiable information (PII) will be replaced by a hash key.

Hashed, identified, and referenced deployments

Column name	Definition	Additional info
row_id	Faraday's internal key.
person_first_name	First name of the individual.
person_last_name	Last name of the individual.
house_number_and_street	Physical address of the individual.
city	City the individual resides in.
state	State the individual resides in.
postcode	Postcode/Zip code the individual resides in.
email	Email address of the individual in Faraday's data.
fdy_persona_set_persona_id	The ID of the persona set in which this individual's persona exists (not the persona itself).
fdy_persona_set_persona_name	The name of the persona that the individual belongs to.
fdy_outcome_propensity_score	Absolute score (scale of 0.0-1.0) of the individual based on the model used.	Propensity scores below 0.5 indicate the predictive model is leaning toward the individual not achieving the outcome, and vice versa.
fdy_outcome_propensity_percentile	Relative rank of the individual's score among all values.	1=lowest, 100=highest. To get the top 1% of scores, you want percentiles 99–100.
fdy_outcome_propensity_probability	Absolute score (scale of 0.0-1.0) of the individual based on the model used.	Probability scores indicate the likelihood of an individual to achieve the outcome. A score of 0.75 indicates they have a 75% chance of achieving it.

Aggregated deployments

Column name	Definition	Additional info
County	The aggregation level selected when creating the deployment.	Can be county, metro, state, or zipcode.
Metro	The aggregation level selected when creating the deployment.	Can be county, metro, state, or zipcode.
State	The aggregation level selected when creating the deployment.	Can be county, metro, state, or zipcode.
Zipcode	The aggregation level selected when creating the deployment.	Can be county, metro, state, or zipcode.
[count or avg]_fdy_outcome_propensity_score	The total number of people in the location based on selected deployment filters (count) or the average score of people in this aggregated location based on selected deployment filters (avg).	Avg is the absolute score (scale of 0.0-1.0) of the individuals based on the model used. Propensity scores below 0.5 indicate the predictive model is leaning toward the individuals not achieving the outcome, and vice versa.
[count or avg]_fdy_outcome_propensity_percentile	The total number of people in the location based on selected deployment filters (count) or the average percentile of people in this aggregated location based on selected deployment filters (avg).	Relative rank of the individual's score among all values. Average percentile: 1=lowest, 100=highest. To get the top 1% of scores, you want percentiles 99–100.
[count or avg]_fdy_outcome_propensity_probability	The total number of people in the location based on selected deployment filters (count) or the average probability of people in this aggregated location based on selected deployment filters (avg).	Absolute score (scale of 0.0-1.0) of the individual based on the model used. Probability scores indicate the likelihood of an individual to achieve the outcome. A score of 0.75 indicates they have a 75% chance of achieving it.
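
To make the count and avg columns concrete, the sketch below groups a few made-up individual rows by state and computes both values. The aggregate_by helper and the sample data are hypothetical; the real aggregation happens on Faraday's side when you choose the aggregated format.

```python
from collections import defaultdict


def aggregate_by(rows, geo_key, value_key):
    """Return the count and average of value_key for each geography."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[geo_key]].append(row[value_key])
    return {
        geo: {"count": len(values), "avg": sum(values) / len(values)}
        for geo, values in groups.items()
    }


rows = [
    {"state": "VT", "fdy_outcome_propensity_probability": 0.75},
    {"state": "VT", "fdy_outcome_propensity_probability": 0.25},
    {"state": "NY", "fdy_outcome_propensity_probability": 0.5},
]
summary = aggregate_by(rows, "state", "fdy_outcome_propensity_probability")
print(summary["VT"])  # {'count': 2, 'avg': 0.5}
```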

Understanding score explainability

When adding a payload to your pipeline, you can tick the "include prediction explanations" checkbox to add score explainability to your deployments. These explanations detail which traits had the highest impact in calculating each individual's predicted score.

Above, we see an example of CSV output from a pipeline. John is impacted by this outcome's major factors, age and number of children, which leave him with a low probability of converting.

Jane, on the other hand, exhibits a more unusual combination of traits that give her a much higher conversion probability for different reasons. For Jane, both her household income and millennial lifestyle pushed her conversion probability higher than John's, because this business often sees conversions from people with those traits, even if they're not the dominant traits.

Score explainability can help you understand, down to the individual level, which traits in your data are influencing how likely (or unlikely) individuals are to achieve your predictive outcomes.

📘Score explainability headers

The above image contains simplified column headers for the sake of this example. Your output might look something like "fdy_outcome_lead_conversion_propensity_explanation" if you select human-friendly column headers, or have a hashed value in place of the outcome name for machine-friendly. Column headers can be edited via the structure advanced setting while configuring a deployment.

Deleting a pipeline

Before deleting a pipeline, ensure that any deployments have first been deleted. Once there are no deployments using the pipeline, you can safely delete it.

Dashboard: click the options menu (three dots) on the far right of the pipeline you'd like to delete, or upper right when viewing a pipeline, then click delete.
API: use the delete scope API request.

📘Deleting resources

See object preservation for info on the order in which resources should be deleted.