Prepare Your Dataset

In our previous post, we guided you through the different ways to import your data. Now, let’s talk about preparing that data in what we call a dataset.

Creating and managing datasets is important for maintaining high data quality and building the processes that will bring you value. The size and purpose of each dataset is for you to decide: will it be one dataset per jurisdiction, client type or industry, or one all-in-one dataset? The choice is yours, based on how you will use the platform. Feel free to write to us for advice in our channel.

Ready to get started? Let's create beautiful datasets.

Data Selection: Ensuring Relevance and Accuracy

Right after choosing how to import your data into spektr, you have the opportunity to review and select your data attributes. This phase is critical for checking that all necessary fields are present and for keeping only the ones relevant to your specific use case. It is also where you need to make sure that each field has been identified with the correct data type.

Select your attributes

The spektr Connection Engine “discovers” the data you want to import and presents all fields in a view for you to verify. During this step, you can make sure the necessary data is there and select only the relevant fields in order to:

Reduce complexity: Eliminate unnecessary data that can complicate your processes.
Enhance focus: Concentrate on the data that directly contributes to your goals.
Improve performance: Streamline operations by working with relevant data.
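
To make the idea concrete, here is a minimal Python sketch of what attribute selection amounts to. It is an illustration only and not spektr's implementation or API: the function name select_attributes, the record layout and the "Internal Note" field are all hypothetical.

```python
from typing import Any


def select_attributes(
    records: list[dict[str, Any]], selected: list[str]
) -> list[dict[str, Any]]:
    """Keep only the selected attributes (hypothetical helper, not a spektr API)."""
    # Collect every attribute name found in the imported records.
    discovered = set().union(*(r.keys() for r in records)) if records else set()

    # Fail loudly if a requested attribute was never discovered.
    missing = [name for name in selected if name not in discovered]
    if missing:
        raise KeyError(f"Attributes not found in the imported data: {missing}")

    # Drop everything that was not explicitly selected.
    return [{name: r.get(name) for name in selected} for r in records]


records = [
    {
        "Company Name": "Acme ApS",
        "Industry Type": "Pharmaceutical",
        "Internal Note": "call back on Monday",  # irrelevant for the use case
    }
]
slim = select_attributes(records, ["Company Name", "Industry Type"])
```

After the call, slim holds only the two selected fields for each record, which is the smaller and more focused dataset the list above describes.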

Assign the right data type

While you're selecting the necessary data fields, take a moment to verify and set the correct data types. Spektr's discovery function makes a first attempt at detecting them, but correct types are essential for building processes, so a careful second look is time well spent.

Selecting the correct data types for your attributes is important because the type determines which kinds of rules you can apply to a given attribute in your process. For example, if Date of Incorporation is identified as a date, you will be able to build rules on this attribute such as "in the last 10 years" or "after 2020".
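
As a rough illustration of why the type matters, the two example rules can be sketched in Python once the value is a real date. This is not how spektr evaluates rules; the function names are hypothetical, and the sketch assumes "after 2020" means any date from 1 January 2021 onwards.

```python
from datetime import date


def after_year(value: date, year: int) -> bool:
    """True when the date falls after the end of the given year (assumed reading)."""
    return value.year > year


def in_last_years(value: date, years: int, today: date) -> bool:
    """True when the date lies within the last `years` years before `today`."""
    try:
        cutoff = today.replace(year=today.year - years)
    except ValueError:
        # `today` is 29 February and the target year is not a leap year.
        cutoff = today.replace(year=today.year - years, day=28)
    return cutoff <= value <= today


incorporated = date(2021, 3, 1)  # a made-up Date of Incorporation
recent = in_last_years(incorporated, 10, today=date(2025, 1, 1))
post_2020 = after_year(incorporated, 2020)
```

Neither comparison is meaningful if the same value is stored as a plain string, which is why the data type has to be right before rules are built on it.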

These are the data types you can choose for your attributes when building your dataset:

Number: for numerical input, for example "First year intend to invest" = 10,000 EUR.
Boolean: for true/false input, for example "Is Pep" = True. Please note that 1 / 0 (numbers), “1” / “0” (strings) and “true” / “false” (strings) are currently not supported as booleans.
Date: for date input, see notes about it here.
Country: for country input, see notes about it here.
String: for everything else, for example, "Industry Type" = "Pharmaceutical".
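
If you want to check your data before importing it, the sketch below shows one way to test values against three of these types in Python. It is a hypothetical pre-import check, not spektr's discovery logic. Date and Country are left out because their accepted formats are covered in the separate notes, and treating True/False as not being numbers is an assumption of this sketch.

```python
from typing import Any


def matches_type(value: Any, data_type: str) -> bool:
    """Check a value against a data type name (hypothetical pre-import check)."""
    if data_type == "Boolean":
        # Only real booleans pass: 1, 0, "1", "0", "true" and "false" do not.
        return isinstance(value, bool)
    if data_type == "Number":
        # In Python, bool is a subclass of int, so exclude it explicitly.
        return isinstance(value, (int, float)) and not isinstance(value, bool)
    if data_type == "String":
        return isinstance(value, str)
    raise ValueError(f"Data type not covered by this sketch: {data_type}")


checks = {
    "real boolean": matches_type(True, "Boolean"),
    "number 1 as boolean": matches_type(1, "Boolean"),
    "string 'true' as boolean": matches_type("true", "Boolean"),
    "10000 as number": matches_type(10_000, "Number"),
}
```

Here only the first and last checks come out as True, mirroring the note above about unsupported boolean representations.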

Dataset Identifiers

In spektr, we present a lot of information about the results of your processes, and mostly about the decisions they have taken on your clients. For this, we need to identify your clients via what we call identifiers. Selecting them is the last thing you need to do before you are ready to build your processes.

Select identifiers

You can select a maximum of two identifiers in your dataset. In most cases, "First Name" and "Last Name" will be the usual suspects for a dataset with information about individuals. An attribute called "Company Name" or something similar will most likely be the one to go for if your dataset stores entity information.
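
The two-identifier limit can be pictured with a short Python sketch. It is illustrative only: the helper names are hypothetical, requiring at least one identifier is an assumption of the sketch, and joining the values with a space is just one possible way to display a client.

```python
from typing import Any

MAX_IDENTIFIERS = 2  # the limit stated above


def validate_identifiers(attributes: list[str], identifiers: list[str]) -> None:
    """Raise if the identifier selection is invalid (hypothetical helper)."""
    # Assumption: at least one identifier is needed; the maximum is two.
    if not 1 <= len(identifiers) <= MAX_IDENTIFIERS:
        raise ValueError(f"Select between 1 and {MAX_IDENTIFIERS} identifiers")
    unknown = [name for name in identifiers if name not in attributes]
    if unknown:
        raise ValueError(f"Identifiers are not dataset attributes: {unknown}")


def display_name(record: dict[str, Any], identifiers: list[str]) -> str:
    """Build a readable label for a client from its identifier values."""
    return " ".join(str(record[name]) for name in identifiers)


attributes = ["First Name", "Last Name", "Is Pep"]
identifiers = ["First Name", "Last Name"]
validate_identifiers(attributes, identifiers)
label = display_name(
    {"First Name": "Ada", "Last Name": "Lovelace", "Is Pep": False}, identifiers
)
```

For an entity dataset, the same check would pass with a single identifier such as "Company Name".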

Next, let's build some processes.

For suggestions and questions, feel free to contact us at support@spektr.com or write directly in our channel.

Build a Risk Assessment Process

Learn how to build client risk assessments through highly customisable processes.

Design an Ongoing Monitoring Set-up

Explore how to create monitoring setups.