
The following utility functions are available:

 

 

DataSource Function Definitions 


Code Block
languagejava
public ScheduleDefinition getScheduleDefinition() {
	// Run the connector's background task (autoRun) every 60 minutes.
	return new ScheduleDefinition("MINUTES", null, 60);
}

 

 


public boolean autoRun(); 

autoRun() is called to perform any background tasks, on the schedule returned by getScheduleDefinition(). For example, it could be used to download and cache data locally.

 


protected final byte[] loadBlob(String key); 

loadBlob() loads a blob (byte[]) that was previously saved by the connector, usually from a background task. The parameter key is the unique identifier of the data to load. Blobs can only be loaded on data sources that have been saved; areBlobsAvailable() can be used to check whether blob access is available.

 


protected final boolean saveBlob(String key, byte[] data); 

saveBlob() saves a blob (byte[]) for later use, typically to persist data produced by background tasks. The parameter key is the unique identifier for the data to be saved, and data is the byte[] to associate with that key. Passing null as data deletes the saved data for the specified key. Blob access is only available on data sources that have been saved; areBlobsAvailable() can be used to check whether it is available.

 


protected final boolean areBlobsAvailable(); 

areBlobsAvailable() can be used to check whether blob access is available. Blobs can only be loaded on data sources that have been saved; blob access will not be available if a connector is tested prior to being saved.
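
As a hedged sketch, the three functions above might combine in a background task like this (fetchLatestData() and the "CACHED_DATA" key are illustrative, not part of the API):

Code Block
languagejava
public boolean autoRun() {
	// Blob access is unavailable until the data source has been saved,
	// so skip caching when the connector is only being tested.
	if (!areBlobsAvailable()) {
		return false;
	}

	// fetchLatestData() is a hypothetical helper that downloads data from
	// the external API and serialises it to a byte[].
	byte[] data = fetchLatestData();
	if (data == null) {
		return false;
	}

	// Cache the download under an illustrative key for later queries.
	return saveBlob("CACHED_DATA", data);
}

A query path could then call loadBlob("CACHED_DATA") to retrieve the cached bytes, again guarding the call with areBlobsAvailable().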

 


public final Object getAttribute(String key); 

getAttribute() fetches attributes from the connection meta-data. For example, a Username may be specified for the connection through the Yellowfin UI; using the matching key, the contents of the Username meta-data field can be fetched for use when retrieving data from external APIs.
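
For example, as a brief sketch (assuming the connector defines a connection meta-data field whose key is "Username"):

Code Block
languagejava
// Fetch the value the user entered in the "Username" connection field.
// getAttribute() returns Object, so cast it to the expected type.
String username = (String) getAttribute("Username");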

 


public final Integer getSourceId(); 

getSourceId() can be used to fetch the unique internal id of the source that this connector is associated with. This may be helpful for segregating data by connection in some kind of external cache or database.
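
A hedged sketch of using the source id to segregate data in an external database (externalDb and the connector_cache table are illustrative, not part of the API):

Code Block
languagejava
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

// Clear this connection's rows in a hypothetical external cache table,
// leaving data cached by other connections untouched.
private void clearExternalCache(Connection externalDb) throws SQLException {
	String sql = "DELETE FROM connector_cache WHERE source_id = ?";
	try (PreparedStatement stmt = externalDb.prepareStatement(sql)) {
		stmt.setInt(1, getSourceId());
		stmt.executeUpdate();
	}
}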

 

 

Recommendations for using saveBlob() and loadBlob() 

 

Minimise Stored/Cached Data 

It is recommended to only store data that cannot be reliably retrieved from the external source. One example is "sliding window" access to data, where only a limited amount of historical data is available and must be downloaded before it becomes unavailable. Locally stored data can also be used to improve query speed against extremely slow data sets.

If significant amounts of data are stored in the blob system, it is recommended to truncate the data after a certain period. This might mean deleting all data when it reaches a certain age, or storing less granular information for older data. For instance, store raw data for three months, daily aggregated data for one year, and weekly aggregated data for older data. To achieve this, a background job would need to re-aggregate and re-store the data. 
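
As a hedged sketch of the truncation step (assuming the monthly key scheme described under Minimise Blob Size below; knownMonthlyKeys() is a hypothetical helper listing the connector's stored keys):

Code Block
languagejava
import java.time.YearMonth;
import java.time.format.DateTimeFormatter;

// Delete monthly blobs older than twelve months; as described for
// saveBlob(), writing null to a key deletes the data saved under it.
private void truncateOldBlobs() {
	DateTimeFormatter fmt = DateTimeFormatter.ofPattern("yyyyMM");
	YearMonth cutoff = YearMonth.now().minusMonths(12);
	for (String key : knownMonthlyKeys()) {
		YearMonth month = YearMonth.parse(key.substring(0, 6), fmt);
		if (month.isBefore(cutoff)) {
			saveBlob(key, null);
		}
	}
}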

 

Minimise Blob Size 

Storing and loading large blobs places significant load on the Yellowfin database and server. If possible, distribute stored data across multiple blobs.

For example, suppose 100,000 tweets are stored in the blob storage system under the single key "ALL_TWEETS". To minimise blob loading times, and to avoid overloading the caching system, this data could instead be split and stored in smaller chunks.

One way to do this would be to split up tweets by month: 

     "201601_TWEETS" 

     "201602_TWEETS" 

     "201603_TWEETS" 

     "201604_TWEETS" 

 

When a query is requested from the connector, filters can be used to determine which blobs need to be used, and thus loaded from the blob storage system. A query with the specified date range of 2016-02-05 to 2016-03-05 would just need to load the data for February and March, “201602_TWEETS” and “201603_TWEETS”. 
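
A sketch of deriving the required blob keys from a date-range filter (the key format follows the example above; translating the report filter into two LocalDate values is assumed to have happened already):

Code Block
languagejava
import java.time.LocalDate;
import java.time.YearMonth;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
import java.util.List;

// Build the list of monthly blob keys that cover a date-range filter.
public List<String> blobKeysForRange(LocalDate from, LocalDate to) {
	DateTimeFormatter fmt = DateTimeFormatter.ofPattern("yyyyMM");
	List<String> keys = new ArrayList<>();
	YearMonth month = YearMonth.from(from);
	YearMonth last = YearMonth.from(to);
	while (!month.isAfter(last)) {
		keys.add(month.format(fmt) + "_TWEETS");
		month = month.plusMonths(1);
	}
	return keys;
}

For the 2016-02-05 to 2016-03-05 range above, this returns "201602_TWEETS" and "201603_TWEETS", so only those two blobs need to be loaded.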

Joining datasets from multiple blobs can add significant overhead. It is recommended to take this into account and compare the performance of multiple smaller blobs against fewer larger ones.

There is no ideal size for blobs. The loading speed of blobs from the Yellowfin database depends on the hardware and DBMS used. Public connectors will be used on Yellowfin installations of all sizes, so smaller, less powerful systems should be taken into consideration.

 

 

Application-Level Filtering & Aggregations 

Yellowfin supports application-level filtering and aggregation. This allows Yellowfin to aggregate and filter data after receiving a result set from a connector. 

Application-level aggregation is toggled based on the capabilities of the connector. If any DataSet columns (as returned by getColumns()) support native aggregations (where the connector itself returns aggregated data), then those aggregations will be available.

Application-level filtering is also toggled based on the capabilities of the connector's columns. If any DataSet columns (as returned by getColumns()) support native filtering (where the connector applies its own filters), then application-level filtering will be disabled and only connector column filters will be available. Connector filters (as returned by getFilters()) can co-exist with application-level filters.

 

 

Custom Error Messages 

Custom messages can be returned to the Yellowfin UI if an error occurs whilst running a connector report. This can be done by throwing a ThirdPartyException with a custom message from the connector plugin.

Code Block
languagejava
throw new ThirdPartyException("Unable to connect to the Twitter API at this time."); 

 

The custom error message will be shown as a standard "Oh-No" error where the report would usually be rendered. The exception would usually be thrown from the execute() function on a DataSet.
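
A hedged sketch of throwing the exception from a DataSet (the surrounding execute() body and the callTwitterApi() helper are illustrative; the exact signature may differ):

Code Block
languagejava
try {
	// callTwitterApi() is a hypothetical helper wrapping the external call.
	return callTwitterApi(filters);
} catch (IOException e) {
	// Surface a readable message in the Yellowfin UI instead of a stack trace.
	throw new ThirdPartyException("Unable to connect to the Twitter API at this time.");
}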