Notebooks tie together your Datasets and your Functions: you use notebooks to transform and analyse your datasets using custom Python and SQL or a function from NStack's library. A notebook runs in a scalable, distributed environment with error handling, logging, and schema inference.
If you are on a free plan, function execution is limited to 2 minutes; on a paid plan, this limit rises to 20 minutes.
Notebooks cannot be run anonymously, so you will need to create an NStack account to use them.
How to use
- On the Notebook tab, search for a Dataset to use. This will load it into the first "cell" of the notebook, where you can view the data, or see the properties and metadata of your dataset.
- In the second cell, you can either choose to write a SQL or Python snippet, or select an existing function from your library.
- Click the Run Notebook button. This will either create a new cell with your output dataset, or give you an error message.
Once you have successfully run a Notebook, you can choose to save your function or your dataset for reuse in the future.
SQL
The SQL transformer allows you to treat your dataset like a database and use SQL to SELECT data from it. NStack's SQL syntax is ANSI SQL, and all columns in your dataset are available on a table named DATASET. For instance, if you had two columns, id and price, in your dataset, you could run the following query:
SELECT id FROM DATASET WHERE price > 5
The default query in the query input is SELECT * FROM DATASET. To use this transformer, change this query and click the Run Query button. The output section below will be populated with either your query results or an error message.
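For readers more familiar with pandas, the example query above corresponds roughly to the following dataframe filter. This is only an illustrative sketch: the data is made up, and the column names id and price are just the ones from the example query.

```python
import pandas as pd

# Hypothetical stand-in for DATASET, with the two example columns.
df = pd.DataFrame({"id": [1, 2, 3], "price": [3.0, 7.5, 12.0]})

# Equivalent of: SELECT id FROM DATASET WHERE price > 5
result = df.loc[df["price"] > 5, ["id"]]
```

Here `result` contains the id values 2 and 3, the rows whose price exceeds 5.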
Python
This transformer allows you to run a 'serverless' Python 3 function on your dataset. In the code input box, you will see a prepopulated Python method named snippet. You can change this to be any valid Python code.
By default, this Python environment includes pandas, sklearn, and numpy.
import pandas as pd

def snippet(df: pd.DataFrame) -> pd.DataFrame:
    return df.describe(include='all').reset_index()
Your dataset is passed to this method as a single pandas dataframe, and the method returns a single pandas dataframe, which becomes the output you see below. Inside this function, you can import any of the included libraries, such as numpy and sklearn. As with the SQL transformer, you will receive either output data or an error message in the section below.
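As a sketch of what a custom snippet might look like, the default method could be replaced with one that filters and aggregates instead of describing the data. The price column here is purely hypothetical, chosen to mirror the SQL example; it is not part of every dataset.

```python
import pandas as pd

def snippet(df: pd.DataFrame) -> pd.DataFrame:
    # Keep only rows priced above 5 (assumes a hypothetical
    # 'price' column), then report the mean of the remaining prices.
    filtered = df[df["price"] > 5]
    return (
        filtered[["price"]]
        .mean()
        .to_frame(name="mean_price")
        .reset_index()
    )
```

Like the default, this takes one dataframe and returns one dataframe, so it slots directly into the code input box.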