\scInstance <- [ \scInstance : handle, {[#:key, $:value]} : data, $ : linkage, $ : distance, {[$:event, &:weight]} : eventWeights ]: scConstruct;

This function initializes the clustering plugin if called with an empty `handle` parameter (`scNoInstance`). If called with a non-empty `handle` parameter, the plugin will be initialized from an on-disk cache.

The data to be clustered must be passed as a set of numbered strings, e.g.:

{[1, "123321"], [2, "122311"]}

Note that the numbers (keys) must be consecutive and start at 1.
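The key requirement can be checked before the data is passed to `scConstruct`. The following Python sketch is only an illustration and not part of the plugin: `check_keys` is a hypothetical helper, and the data set is modeled as a list of `(key, value)` tuples.

```python
def check_keys(data):
    """Return True if the keys are exactly 1, 2, ..., n (in any order)."""
    keys = sorted(key for key, _value in data)
    return keys == list(range(1, len(keys) + 1))


# The example data set from above satisfies the requirement.
print(check_keys([(1, "123321"), (2, "122311")]))  # True
# A gap in the numbering (no key 2) does not.
print(check_keys([(1, "123321"), (3, "122311")]))  # False
```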

The `linkage` parameter describes the type of linkage used by the algorithm. The allowed values are: "single", "maximum", "average".

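As an illustration of the three linkage types, here is a minimal Python sketch. It assumes that the names carry their usual meaning from hierarchical clustering (smallest, largest, and mean pairwise distance between two clusters). The plugin's actual implementation is not described here, and `cluster_distance` is a hypothetical name.

```python
def cluster_distance(a, b, dist, linkage):
    """Distance between clusters a and b under the given linkage.

    a, b    -- non-empty lists of elements
    dist    -- function returning the distance between two elements
    linkage -- "single", "maximum" or "average"
    """
    pairs = [dist(x, y) for x in a for y in b]
    if linkage == "single":    # closest pair of elements
        return min(pairs)
    if linkage == "maximum":   # farthest pair of elements
        return max(pairs)
    if linkage == "average":   # mean over all pairs
        return sum(pairs) / len(pairs)
    raise ValueError("unknown linkage: " + linkage)


def absolute_difference(x, y):
    return abs(x - y)


print(cluster_distance([0, 1], [3, 5], absolute_difference, "single"))   # 2
print(cluster_distance([0, 1], [3, 5], absolute_difference, "maximum"))  # 5
print(cluster_distance([0, 1], [3, 5], absolute_difference, "average"))  # 3.5
```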
The `distance` parameter describes the distance function to be used. The allowed values are: "levenshtein", "damereau", "journey", "event_histogram".

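For reference, the textbook Levenshtein (edit) distance between two event strings can be sketched in Python as follows. This is the standard dynamic-programming formulation, not necessarily the plugin's own code, and it ignores event weights.

```python
def levenshtein(s, t):
    """Minimum number of insertions, deletions and substitutions turning s into t."""
    # prev[j] holds the distance between the processed prefix of s and t[:j].
    prev = list(range(len(t) + 1))
    for i, cs in enumerate(s, start=1):
        curr = [i]  # distance from s[:i] to the empty string
        for j, ct in enumerate(t, start=1):
            cost = 0 if cs == ct else 1
            curr.append(min(prev[j] + 1,          # delete cs
                            curr[j - 1] + 1,      # insert ct
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


# The two example strings from the data set above differ by two substitutions.
print(levenshtein("123321", "122311"))  # 2
```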
The `eventWeights` parameter contains the event weights, which affect some distance measures. Each event type should be coded as a single-character string, and weights should be real numbers. For example:

{["1", 1.0], ["2", 0.5], ["3", 10.0]}

You can also pass an empty set (`{}`) as event weights. In this case, all events will have weight 1.0.
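The default-weight rule can be pictured with a small Python sketch. `event_weight` is a hypothetical helper, not a plugin function. Returning 1.0 for an event that is missing from a non-empty weight set is an assumption made here; only the empty-set case is documented above.

```python
def event_weight(event, event_weights):
    """Look up the weight of an event given a list of (event, weight) pairs."""
    if not event_weights:
        # Documented behaviour: an empty set means every event weighs 1.0.
        return 1.0
    # ASSUMPTION: events absent from a non-empty set also default to 1.0.
    return dict(event_weights).get(event, 1.0)


weights = [("1", 1.0), ("2", 0.5), ("3", 10.0)]
print(event_weight("2", weights))  # 0.5
print(event_weight("2", []))       # 1.0
```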

{[#:cluster, #:key]} <- [\scInstance, #:nCluster] : scGetClusters;

Cluster the data into `nCluster` clusters and return the cluster number for each data element (key). If `nCluster` has the special value `scBestNClusters`, the algorithm will choose the best number of clusters for this dataset, based on silhouette analysis.
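To give an idea of what silhouette analysis measures, the following Python sketch computes the mean silhouette width of a clustering from a distance matrix, using the standard textbook definition. How the plugin itself scores and compares candidate cluster counts is not specified here, and `silhouette_score` is a hypothetical name.

```python
def silhouette_score(dist, labels):
    """Mean silhouette width.

    dist   -- symmetric distance matrix as a list of lists
    labels -- labels[i] is the cluster of element i
    """
    clusters = {}
    for i, label in enumerate(labels):
        clusters.setdefault(label, []).append(i)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    total = 0.0
    for i, label in enumerate(labels):
        own = clusters[label]
        if len(own) == 1:
            continue  # a singleton contributes a silhouette of 0
        # a: mean distance to the other members of the same cluster
        a = sum(dist[i][j] for j in own if j != i) / (len(own) - 1)
        # b: mean distance to the members of the nearest other cluster
        b = min(sum(dist[i][j] for j in members) / len(members)
                for other, members in clusters.items() if other != label)
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(labels)


# Four points on a line that form two obvious groups.
points = [0, 1, 10, 11]
dist = [[abs(p - q) for q in points] for p in points]
good = silhouette_score(dist, [0, 0, 1, 1])  # natural split
bad = silhouette_score(dist, [0, 1, 1, 1])   # lopsided split
print(good > bad)  # True
```

A higher score indicates tighter, better separated clusters, so a candidate number of clusters with a higher score would be preferred.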

{[#:cluster, #:key]} <- [\scInstance, #:nCluster] : scGetMedoids;

Cluster the data into `nCluster` clusters and return the key of the medoid of each cluster. If `nCluster` has the special value `scBestNClusters`, the algorithm will choose the best number of clusters for this dataset, based on silhouette analysis.
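A medoid is usually defined as the cluster member with the smallest total distance to all other members. The Python sketch below uses that common definition; it is an illustration with hypothetical names (`medoid`, `positions`), not the plugin's code, and ties are broken by taking the first candidate.

```python
def medoid(keys, dist):
    """Return the key whose summed distance to all other keys is smallest."""
    return min(keys, key=lambda k: sum(dist(k, other) for other in keys))


# Toy cluster: three keys placed on a line at positions 0, 1 and 5.
positions = {1: 0.0, 2: 1.0, 3: 5.0}


def line_distance(a, b):
    return abs(positions[a] - positions[b])


# Summed distances: key 1 -> 6, key 2 -> 5, key 3 -> 9.
print(medoid([1, 2, 3], line_distance))  # 2
```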

{#} <- \scInstance : scGetBestNClusters;

Return the 4 best numbers of clusters for this dataset. The first number in the returned array is always the best choice.

\scInstance <- [\scInstance, #:nCluster, #:cluster] : scDrillDown;

Cluster the data into `nCluster` clusters and then drill down into the cluster numbered `cluster`. From now on, all of the above function calls will work only on the data from the selected cluster.

\scInstance <- \scInstance : scUndrill;

Reset the drill-down. From now on, all of the above function calls will work on the complete dataset.

{[#: key1, #: key2, &: dist]} <- \scInstance : scGetDistanceTab;

Return the distance table for the current dataset and distance measure.
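The shape of such a table, a set of `[key1, key2, dist]` triples, can be sketched in Python as follows. Both function names are hypothetical. Listing each unordered pair once and leaving out the diagonal are assumptions, since the layout of the plugin's table is not described here. A simple position-wise mismatch count stands in for the configured distance measure.

```python
from itertools import combinations


def mismatch_count(s, t):
    """Stand-in distance: number of differing positions (equal-length strings)."""
    return float(sum(a != b for a, b in zip(s, t)))


def distance_table(data, dist):
    """Build (key1, key2, distance) triples for every pair with key1 < key2."""
    items = sorted(data)  # order by key
    return [(k1, k2, dist(v1, v2))
            for (k1, v1), (k2, v2) in combinations(items, 2)]


data = [(1, "123321"), (2, "122311")]
print(distance_table(data, mismatch_count))  # [(1, 2, 2.0)]
```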