Aerial.utils 1.2.0
Utility 'tool belt' of functions for common tasks; trees; clustering; probability, stats, and information theory; et al.
Namespaces
aerial.utils.coll
Various supplementary collection functions not included in the standard Clojure ecosystem. Mostly for seqs, but also for vectors, maps and sets.
Public variables and functions:
- coalesce-xy-yx
- concatv
- drop-until
- dropv
- dropv-until
- dropv-while
- ensure-vec
- in
- map->csv-map
- map-entry?
- merge-with*
- mkseq
- partitionv-all
- pos
- pos-any
- positions
- pxmap
- random-subset
- reducem
- rotate
- rotations
- separate
- separatev
- sliding-take
- splitv-at
- subsets
- take-until
- take-until-nochange
- takev
- takev-until
- takev-while
- third
- transpose
- vfold
- xfold
- xprod
- xprod-rng1k
aerial.utils.ds.bktrees
BK-trees: Metric trees over discrete spaces. Parameterizable with various (not necessarily discrete) metrics and arbitrary values.
Public variables and functions:
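To make the BK-tree idea concrete, here is a minimal sketch in plain Clojure. It does not use or reproduce the API of aerial.utils.ds.bktrees; the names `hamming-sketch`, `bk-insert`, and `bk-query` are hypothetical, and the sketch assumes an integer-valued metric so that distances can serve as child keys.

```clojure
(defn hamming-sketch
  "Number of positions at which equal-length sequences a and b differ.
  Hypothetical helper, used here only as an example discrete metric."
  [a b]
  (count (filter false? (map = a b))))

(defn bk-insert
  "Insert item into a BK-tree node (nil is the empty tree) under metric dist.
  Each child is keyed by its distance from the parent's value."
  [node dist item]
  (if (nil? node)
    {:value item :children {}}
    (let [d (dist item (:value node))]
      (if (zero? d)
        node ; item is already present
        (update-in node [:children d] bk-insert dist item)))))

(defn bk-query
  "Return all values in the tree within distance radius of item.
  By the triangle inequality, only children keyed within
  [d - radius, d + radius] can hold matches."
  [node dist item radius]
  (if (nil? node)
    []
    (let [d    (dist item (:value node))
          hits (if (<= d radius) [(:value node)] [])]
      (reduce (fn [acc [k child]]
                (if (<= (- d radius) k (+ d radius))
                  (into acc (bk-query child dist item radius))
                  acc))
              hits
              (:children node)))))

(def example-tree
  (reduce #(bk-insert %1 hamming-sketch %2)
          nil
          ["ACGT" "ACGA" "TCGA" "GGGG"]))

;; Finds "ACGT" and "ACGA", the two entries within Hamming distance 1.
(bk-query example-tree hamming-sketch "ACGG" 1)
```

Passing the metric as an argument mirrors the "parameterizable" point above: any function satisfying the metric axioms could be swapped in for `hamming-sketch`.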
aerial.utils.ds.graphs
Various graph algorithms, techniques, functions. Generally applicable, but typically used for sequence similarity, clustering, and path metrics of various sorts.
Public variables and functions:
aerial.utils.io
Various supplementary I/O functions and macros. Some new, some recapitulations of previous extremely useful functionality that is no longer included in the standard Clojure ecosystem.
Public variables and functions:
- ->auioReader
- add-rdrwtr
- assemble-bindings
- auio-rdrs
- close-stream
- do-text-file
- do-text-to-text
- fd-use
- force-gc-finalize
- get-xform-body-bindings
- get-xform-rdrwrtr-bindings
- letio
- map->auioReader
- nss-set
- open-file
- open-streaming-gzip
- process-line
- rdrwtrs-set
- read-lines
- read-stream
- reduce-file
- rwfl-syms
- with-in-reader
- with-out-append-writer
- with-out-writer
- write-lines
- write-stream
aerial.utils.math
Various math functions that don't quite have enough totality to have their own home namespace just yet.
aerial.utils.math.clustering
Various data clustering algorithms, techniques, functions. Generally applicable, but typically used for sequence clustering of various sorts.
Public variables and functions:
- center-dist-expect
- center-distances
- centers
- cluster-distances
- cluster-stdev
- clusters
- davies-bouldin-index
- DBI-Rij
- DBI-Si
- density
- dist-matrix
- edist
- extreme-pd
- farthest
- farthest-pd
- find-clusters
- intercluster-density
- intra-dist-expect
- intra-distances
- ith-sum-sqr-err
- kmeans
- kmeans++
- knn
- knn-graph
- krnn-clust
- krnn-graph
- loyd-step
- nearest
- nearest-pd
- refoldin-outliers
- S-Dbw-index
- scatt
- split-krnn
- split-worst-cluster
- sum-sqr-err
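As a rough illustration of the kind of computation behind k-means style clustering, the following plain-Clojure sketch performs one Lloyd iteration: assign each point to its nearest center, then recompute each center as the mean of its assigned points. It is not the namespace's implementation; `edist-sketch`, `centroid-sketch`, and `lloyd-step-sketch` are hypothetical names, and the sketch assumes points are equal-length numeric vectors.

```clojure
(defn edist-sketch
  "Euclidean distance between two equal-length numeric vectors."
  [a b]
  (Math/sqrt (reduce + (map (fn [x y] (let [d (- x y)] (* d d))) a b))))

(defn centroid-sketch
  "Component-wise mean of a non-empty collection of vectors."
  [points]
  (let [n (count points)]
    (mapv #(/ % (double n)) (apply map + points))))

(defn lloyd-step-sketch
  "One Lloyd iteration. Returns the new centers.
  Centers that attract no points are dropped in this sketch."
  [centers points]
  (->> points
       (group-by (fn [p] (apply min-key #(edist-sketch p %) centers)))
       vals
       (mapv centroid-sketch)))

;; Two well-separated groups: the centers move to the group means.
(lloyd-step-sketch [[0 0] [10 10]]
                   [[0 0] [0 2] [10 10] [10 12]])
```

Repeating the step until the centers stop changing gives the usual k-means loop; how the library seeds and terminates that loop is not shown here.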
aerial.utils.math.combinatorics
Various supplementary math functions centering on combinatorics that are not supplied by clojure.math.combinatorics. Effectively an extension of that library.
Public variables and functions:
aerial.utils.math.infoth
Various Information Theory functions and measures rooted in the Shannon entropy measure and targeting a variety of sequence and string data.
Public variables and functions:
- all-grams
- bi-tri-grams
- combin-joint-entropy
- cond-entropy
- conditional-mutual-information
- CREl
- dice-coeff
- diff-fn
- DLX||Y
- DX||Y
- entropy
- expected-qdict
- freq-jaccard-index
- freq-xdict-dict
- hamming
- HXY
- HX|Y
- hybrid-dictionary
- II
- information-capacity
- informativity
- interaction-information
- IXY
- IXY|Z
- jaccard-dist
- jaccard-index
- jensen-shannon
- joint-entropy
- KLD
- lambda-divergence
- levenshtein
- limit-entropy
- limit-informativity
- lod-score
- log-odds
- max-qdict-entropy
- mutual-information
- ngram-compare
- ngram-vec
- normed-codepoints
- q-1-dict
- q1-xdict-dict
- raw-lod-score
- reconstruct-dict
- relative-entropy
- seq-joint-entropy
- shannon-entropy
- TCI
- total-correlation
- tversky-index
- variation-information
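Since the measures in this namespace are rooted in Shannon entropy, a small plain-Clojure sketch of that quantity may help. It computes H = -Σ p(x) log2 p(x) over the empirical symbol frequencies of a sequence. The name `entropy-sketch` is hypothetical and the signature is an assumption; it is not the library's `entropy` or `shannon-entropy`.

```clojure
(defn entropy-sketch
  "Shannon entropy, in bits, of the empirical symbol distribution of coll.
  coll may be any seqable, including a string."
  [coll]
  (let [n (count coll)]
    (- (reduce + 0.0
               (for [[_ c] (frequencies coll)
                     :let [p (/ c (double n))]]
                 ;; log2 p, computed as ln p / ln 2
                 (* p (/ (Math/log p) (Math/log 2.0))))))))

;; Two equally likely symbols carry one bit per symbol.
(entropy-sketch "AACC")

;; Four equally likely symbols carry two bits per symbol.
(entropy-sketch "ACGT")
```

Joint and conditional entropy, mutual information, and the divergences listed above can all be built from this same sum over different distributions.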
aerial.utils.math.probs-stats
Various frequency, combinatorial, probability, and statistical measures and metrics for a variety of sequence and string data.
Public variables and functions:
- alphabet2
- avg-std-deviation
- avg-variance
- binomial-dist
- binomial-pdf
- cc-combins-freqn
- cc-combins-freqs-probs
- cc-freqn
- cc-freqs
- cc-freqs&probs
- cc-freqs-probs
- cc-tfreqn
- cc-tfreqs
- choose-k-freqn
- combin-count-reduction
- combins-freqn
- combins-freqs-probs
- cond-probability
- correlation
- covariance
- flatten-pair-coll
- freqn
- freqs&probs
- freqs-probs
- geometric-dist
- geometric-pdf
- joint-prob-x
- joint-probability
- JPSxy
- JPxy
- keysort
- letter-pairs
- mean
- median
- num-key-freq-map?
- p-value
- pair-coll?
- pdsum
- pearson-correlation
- poisson-cdf
- poisson-dist
- poisson-pdf
- poisson-sample
- probs
- pXY|y
- sampling
- std-deviation
- variance
- word-letter-pairs
aerial.utils.math.scores
Various data test and analysis performance scoring.
Public variables and functions:
aerial.utils.misc
General utility functions and macros. Basically these resources are fairly general and intended to be usable on most any part of most any project.
Public variables and functions:
aerial.utils.string
Various supplementary string functions that are either not included in the standard Clojure ecosystem, or are included but now have an argument order that breaks threading. Most of this stuff really should be in clojure.string, but for unknown reasons isn't.
Public variables and functions:
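One plausible reading of the threading remark, shown with clojure.string only: `split` takes the string as its first argument while `join` takes the collection as its last, so a single `->` or `->>` pipeline cannot use both without nesting. The function `rejoin-sketch` is a hypothetical name for illustration and is not part of this namespace.

```clojure
(require '[clojure.string :as str])

(defn rejoin-sketch
  "Split s on commas and rejoin the pieces with dashes.
  str/split wants the string first (thread-first), while
  str/join wants the collection last (thread-last), so the
  pipeline has to switch threading macros midway."
  [s]
  (-> s
      (str/split #",")
      (->> (str/join "-"))))

(rejoin-sketch "a,b,c")
```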