package molenc

Molecular encoder/featurizer using rdkit and OCaml

Sources

v16.17.1.tar.gz
md5=8ea03f3c39b542811828a5705e82c9a6

Description

Chemical fingerprints are lossy encodings of molecules. molenc encodes molecules as unfolded, counted fingerprints (i.e., potentially very long but sparse vectors of positive integers).
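
To illustrate what "unfolded" and "counted" mean, here is a minimal Python sketch. It is not molenc's implementation: the function name, the feature strings, and the growing-dictionary scheme are assumptions made for this example. Each distinct feature gets its own integer index (no hashing into a fixed-length vector), and the fingerprint stores how many times each feature occurs.

```python
from collections import Counter

def unfolded_counted_fingerprint(features, dictionary):
    """Return a sparse {feature index: count} mapping (hypothetical helper).

    `dictionary` maps feature strings to indices and grows whenever a new
    feature is seen, so the vector is never folded to a fixed length.
    """
    counts = Counter()
    for feat in features:
        # A feature seen for the first time gets the next free index.
        index = dictionary.setdefault(feat, len(dictionary))
        counts[index] += 1
    return dict(sorted(counts.items()))

# Made-up feature strings for two toy molecules sharing one dictionary.
feature_dict = {}
fp_a = unfolded_counted_fingerprint(["C-1-O", "C-1-C", "C-1-O"], feature_dict)
fp_b = unfolded_counted_fingerprint(["C-1-C", "C-2-N"], feature_dict)
print(fp_a)  # {0: 2, 1: 1}
print(fp_b)  # {1: 1, 2: 1}
```

Only non-zero counts are stored, which is what keeps a potentially very long vector cheap to hold in memory.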

Currently, Faulon fingerprints and atom pairs are supported.
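
As a rough illustration of the atom pair idea (Carhart et al., 1985), the Python sketch below counts (type, topological distance, type) triples over all pairs of heavy atoms. It is a simplified sketch, not molenc's code: the molecule is a hand-written adjacency list, atoms are typed by element symbol only, and the graph is assumed to be connected.

```python
from collections import Counter, deque
from itertools import combinations

def topological_distances(adjacency, start):
    """Breadth-first search: bonds on the shortest path from `start`."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for neighbor in adjacency[current]:
            if neighbor not in dist:
                dist[neighbor] = dist[current] + 1
                queue.append(neighbor)
    return dist

def atom_pairs(atom_types, adjacency):
    """Count (type, distance, type) triples over all pairs of atoms."""
    pairs = Counter()
    all_dists = {i: topological_distances(adjacency, i) for i in adjacency}
    for i, j in combinations(sorted(adjacency), 2):
        # Sort the two types so (C, O) and (O, C) are the same feature.
        t1, t2 = sorted((atom_types[i], atom_types[j]))
        pairs[(t1, all_dists[i][j], t2)] += 1
    return pairs

# Heavy atoms of ethanol, C-C-O, typed by element symbol only.
types = {0: "C", 1: "C", 2: "O"}
bonds = {0: [1], 1: [0, 2], 2: [1]}
ethanol_pairs = atom_pairs(types, bonds)
print(dict(ethanol_pairs))
```

For this three-atom graph, each of the three atom pairs yields a distinct feature with a count of one.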

Atom types are currently the quadruplet (number of pi electrons, element symbol, number of heavy-atom neighbors, formal charge). In the future, pharmacophore features (a more abstract, fuzzy atom typing scheme) might be supported.
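
The quadruplet can be pictured as a small immutable record. The Python sketch below is an assumption-laden illustration: the class name, field names, label format, and hand-assigned example values are hypothetical and do not reflect how molenc computes or serializes atom types.

```python
from typing import NamedTuple

class AtomType(NamedTuple):
    """Hypothetical container for the four components of an atom type."""
    pi_electrons: int      # number of pi electrons
    symbol: str            # element symbol
    heavy_neighbors: int   # number of heavy-atom (non-hydrogen) neighbors
    formal_charge: int     # formal charge

    def label(self) -> str:
        # Illustrative text form only; not molenc's output format.
        return f"{self.pi_electrons},{self.symbol},{self.heavy_neighbors},{self.formal_charge}"

# Hand-assigned, illustrative values (not computed from a real molecule).
carbonyl_carbon = AtomType(pi_electrons=1, symbol="C", heavy_neighbors=3, formal_charge=0)
same_type = AtomType(1, "C", 3, 0)
charged_oxygen = AtomType(0, "O", 1, -1)

print(carbonyl_carbon.label())
print(carbonyl_carbon == same_type)
```

Because the record is a hashable tuple, two atoms with identical components compare equal and can serve as dictionary keys when counting features.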

Bibliography:

Faulon, J. L., Visco, D. P., & Pophale, R. S. (2003). The signature molecular descriptor. 1. Using extended valence sequences in QSAR and QSPR studies. Journal of Chemical Information and Computer Sciences, 43(3), 707-720.

Carhart, R. E., Smith, D. H., & Venkataraghavan, R. (1985). Atom pairs as molecular features in structure-activity studies: definition and applications. Journal of Chemical Information and Computer Sciences, 25(2), 64-73.

Kearsley, S. K., Sallamack, S., Fluder, E. M., Andose, J. D., Mosley, R. T., & Sheridan, R. P. (1996). Chemical similarity using physiochemical property descriptors. Journal of Chemical Information and Computer Sciences, 36(1), 118-127.

OpenSMILES specification. Craig A. James et al. v1.0, 2016-05-15. http://opensmiles.org/opensmiles.html

Published: 13 Feb 2023

Dependencies (16)

  1. pyml >= "20211015"
  2. vector3
  3. parany >= "12.1.1"
  4. ocamlgraph
  5. ocaml >= "5.0"
  6. minicli >= "5.0.0"
  7. line_oriented
  8. dune >= "1.11"
  9. dolog >= "5.0.0"
  10. dokeysto
  11. cpm >= "9.0.0"
  12. conf-rdkit
  13. conf-python-3
  14. conf-graphviz
  15. bst >= "2.0.0"
  16. batteries >= "3.5.0"

Dev Dependencies

None

Used by (5)

  1. hts_shrink >= "3.0.1"
  2. linwrap >= "9.0.3"
  3. oranger >= "3.0.1"
  4. rankers
  5. svmwrap

Conflicts

None