CompOmics
diff --git a/README.md b/README.md
Lines changed: 19 additions & 175 deletions
diff --git a/deeplc/__init__.py b/deeplc/__init__.py
Lines changed: 9 additions & 1 deletion
@@ -5,9 +5,9 @@
 [![Conda](https://img.shields.io/conda/vn/bioconda/deeplc?style=flat-square)](https://bioconda.github.io/recipes/deeplc/README.html)
 [![GitHub Workflow Status](https://flat.badgen.net/github/checks/compomics/deeplc/)](https://github.com/compomics/deeplc/actions/)
 [![License](https://flat.badgen.net/github/license/compomics/deeplc)](https://www.apache.org/licenses/LICENSE-2.0)
-[![Twitter](https://flat.badgen.net/twitter/follow/compomics?icon=twitter)](https://twitter.com/compomics)
 
-DeepLC: Retention time prediction for (modified) peptides using Deep Learning.
+
+DeepLC: Retention time prediction for peptides carrying any modification.
 
 ---
 
@@ -22,21 +22,13 @@ DeepLC: Retention time prediction for (modified) peptides using Deep Learning.
     - [Python module](#python-module)
   - [Input files](#input-files)
   - [Prediction models](#prediction-models)
-- [Q&A](#qa)
-
 ---
 
 ## Introduction
 
-DeepLC is a retention time predictor for (modified) peptides that employs Deep
-Learning. Its strength lies in the fact that it can accurately predict
-retention times for modified peptides, even if hasn't seen said modification
-during training.
+DeepLC is a retention time predictor for peptides. Its strength is that it can accurately predict retention times for modified peptides, even if it has not seen the modification during training.
 
-DeepLC can be used through the
-[web application](https://iomics.ugent.be/deeplc/),
-locally with a graphical user interface (GUI), or as a Python package. In the
-latter case, DeepLC can be used from the command line, or as a Python module.
+DeepLC can be used through the [web application](https://iomics.ugent.be/deeplc/) or as a Python package. In the latter case, DeepLC can be used from the command line, or as a Python module.
 
 ## Citation
 
@@ -53,29 +45,6 @@ If you use DeepLC for your research, please use the following citation:
 Just go to [iomics.ugent.be/deeplc](https://iomics.ugent.be/deeplc/) and get started!
 
 
-### Graphical user interface
-
-#### In an existing Python environment (cross-platform)
-
-1. In your terminal with Python (>=3.7) installed, run `pip install deeplc[gui]`
-2. Start the GUI with the command `deeplc-gui` or `python -m deeplc.gui`
-
-#### Standalone installer (Windows)
-
-[![Download GUI](https://flat.badgen.net/badge/download/GUI/blue)](https://github.com/compomics/DeepLC/releases/latest/)
-
-
-1. Download the DeepLC installer (`DeepLC-...-Windows-64bit.exe`) from the
-[latest release](https://github.com/compomics/DeepLC/releases/latest/)
-2. Execute the installer
-3. If Windows Smartscreen shows a popup window with "Windows protected your PC",
-click on "More info" and then on "Run anyway". You will have to trust us that
-DeepLC does not contain any viruses, or you can check the source code 😉
-4. Go through the installation steps
-5. Start DeepLC!
-
-![GUI screenshot](https://github.com/compomics/DeepLC/raw/master/img/gui-screenshot.png)
-
 
 ### Python package
 
@@ -180,23 +149,24 @@ For a more elaborate example, see
 
 ### Input files
 
-DeepLC expects comma-separated values (CSV) with the following columns:
+DeepLC accepts any PSM file format supported by
+[psm_utils](https://psm-utils.readthedocs.io/en/stable/api/psm_utils.io.html),
+including MaxQuant msms.txt, Sage, MSAmanda, Percolator, and many more. The file
+format is automatically inferred from the file extension, or can be specified
+explicitly with the `--psm-filetype` option.
 
-- `seq`: unmodified peptide sequences
-- `modifications`: MS2PIP-style formatted modifications: Every modification is
-  listed as `location|name`, separated by a pipe (`|`) between the location, the
-  name, and other modifications. `location` is an integer counted starting at 1
-  for the first AA. 0 is reserved for N-terminal modifications, -1 for
-  C-terminal modifications. `name` has to correspond to a Unimod (PSI-MS) name.
-- `tr`: retention time (only required for calibration)
+At a minimum, a tab-separated file with `peptidoform` and `spectrum_id` columns
+is accepted. Peptidoforms must be in
+[ProForma 2.0](https://pubs.acs.org/doi/10.1021/acs.jproteome.1c00771) notation.
+For calibration or fine-tuning, a `retention_time` column is also required.
 
 For example:
 
-```csv
-seq,modifications,tr
-AAGPSLSHTSGGTQSK,,12.1645
-AAINQKLIETGER,6|Acetyl,34.095
-AANDAGYFNDEMAPIEVKTK,12|Oxidation|18|Acetyl,37.3765
+```tsv
+spectrum_id	peptidoform	retention_time
+0	AAGPSLSHTSGGTQSK/2	12.16
+1	AAINQK[Acetyl]LIETGER/2	34.10
+2	AANDAGYFNDEM[Oxidation]APIEVK[Acetyl]TK/3	37.38
 ```
 
 See
@@ -237,130 +207,4 @@ The different parts refer to:
 
 ## Q&A
 
-**__Q: Is it required to indicate fixed modifications in the input file?__**
-
-Yes, even modifications like carbamidomethyl should be in the input file.
-
-**__Q: So DeepLC is able to predict the retention time for any modification?__**
-
-Yes, DeepLC can predict the retention time of any modification. However, if the
-modification is **very** different from the peptides the model has seen during
-training the accuracy might not be satisfactory for you. For example, if the model
-has never seen a phosphor atom before, the accuracy of the prediction is going to
-be low.
-
-**__Q: Installation fails. Why?__**
-
-Please make sure to install DeepLC in a path that does not contain spaces. Run
-the latest LTS version of Ubuntu or Windows 10. Make sure you have enough disk
-space available, surprisingly TensorFlow needs quite a bit of disk space. If
-you are still not able to install DeepLC, please feel free to contact us:
-
-Robbin.Bouwmeester@ugent.be and Ralf.Gabriels@ugent.be
-
-**__Q: I have a special usecase that is not supported. Can you help?__**
-
-Ofcourse, please feel free to contact us:
-
-Robbin.Bouwmeester@ugent.be and Ralf.Gabriels@ugent.be
-
-**__Q: DeepLC runs out of memory. What can I do?__**
-
-You can try to reduce the batch size. DeepLC should be able to run if the batch size is low
-enough, even on machines with only 4 GB of RAM.
-
-**__Q: I have a graphics card, but DeepLC is not using the GPU. Why?__**
-
-For now DeepLC defaults to the CPU instead of the GPU. Clearly, because you want
-to use the GPU, you are a power user :-). If you want to make the most of that expensive
-GPU, you need to change or remove the following line (at the top) in __deeplc.py__:
-
-```
-# Set to force CPU calculations
-os.environ['CUDA_VISIBLE_DEVICES'] = '-1'
-```
-
-Also change the same line in the function __reset_keras()__:
-
-```
-# Set to force CPU calculations
-os.environ['CUDA_VISIBLE_DEVICES'] = '-1'
-```
-
-Either remove the line or change to (where the number indicates the number of GPUs):
-
-```
-# Set to force CPU calculations
-os.environ['CUDA_VISIBLE_DEVICES'] = '1'
-```
-
-**__Q: What modification name should I use?__**
-
-The names from unimod are used. The PSI-MS name is used by default, but the Interim name
-is used as a fall-back if the PSI-MS name is not available. It should be fine as long as it is support by [proforma](https://pubs.acs.org/doi/10.1021/acs.jproteome.1c00771) and [psm_utils](https://github.com/compomics/psm_utils).
-
-**__Q: I have a modification that is not in unimod. How can I add the modification?__**
-
-Unfortunately since the V3.0 this is not possible any more via the GUI or commandline. You will need to use [psm_utils](https://github.com/compomics/psm_utils), above a minimal example is shown where we convert an identification file into a psm_list which is accepted by DeepLC. Here the sequence can for example include just the composition in proforma format (e.g., SEQUEN[Formula:C12H20O2]CE).
-
-**__Q: Help, all my predictions are between [0,10]. Why?__**
-
-It is likely you did not use calibration. No problem, but the retention times for training
-purposes were normalized between [0,10]. This means that you probably need to adjust the
-retention time yourselve after analysis or use a calibration set as the input.
-
-
-**__Q: What does the option `dict_divider` do?__**
-
-This parameter defines the precision to use for fast-lookup of retention times
-for calibration. A value of 10 means a precision of 0.1 (and 100 a precision of
-0.01) between the calibration anchor points. This parameter does not influence
-the precision of the calibration, but setting it too high might mean that there
-is bad selection of the models between anchor points. A safe value is usually
-higher than 10.
-
-
-**__Q: What does the option `split_cal` do?__**
-
-The option `split_cal`, or split calibration, sets number of divisions of the
-chromatogram for piecewise linear calibration. If the value is set to 10 the
-chromatogram is split up into 10 equidistant parts. For each part the median
-value of the calibration peptides is selected. These are the anchor points.
-Between each anchor point a linear fit is made. This option has no effect when
-the pyGAM generalized additive models are used for calibration.
-
-
-**__Q: How does the ensemble part of DeepLC work?__**
-
-Models within the same directory are grouped if they overlap in their name. The overlap
-has to be in their full name, except for the last part of the name after a "_"-character.
-
-The following models will be grouped:
-
-```
-full_hc_dia_fixed_mods_a.hdf5
-full_hc_dia_fixed_mods_b.hdf5
-```
-
-None of the following models will not be grouped:
-
-```
-full_hc_dia_fixed_mods2_a.hdf5
-full_hc_dia_fixed_mods_b.hdf5
-full_hc_dia_fixed_mods_2_b.hdf5
-```
-
-**__Q: I would like to take the ensemble average of multiple models, even if they are trained on different datasets. How can I do this?__**
-
-Feel free to experiment! Models within the same directory are grouped if they overlap in
-their name. The overlap has to be in their full name, except for the last part of the
-name after a "_"-character.
-
-The following models will be grouped:
-
-```
-model_dataset1.hdf5
-model_dataset2.hdf5
-```
-
-So you just need to rename your models.
+See the [FAQ](https://deeplc.readthedocs.io/en/latest/faq.html) in the documentation.
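
Note on the "Input files" change above: the README moves from the old `seq`/`modifications`/`tr` CSV to a tab-separated file with `spectrum_id`, `peptidoform`, and `retention_time` columns. For readers migrating existing files, a minimal sketch of that mapping might look like the following. It uses only the Python standard library; `ms2pip_to_proforma` is a hypothetical helper written for illustration and is not part of DeepLC or psm_utils. The old format carried no charge state, so the charges here are assumed values copied from the new README example. For real data, the readers and writers in psm_utils are likely the safer route.

```python
import csv
import io


def ms2pip_to_proforma(seq, modifications, charge=None):
    """Hypothetical helper: turn seq + 'location|name|...' into a ProForma-style string."""
    n_term, c_term = "", ""
    by_position = {}
    if modifications:
        fields = modifications.split("|")
        # Fields alternate: location, name, location, name, ...
        for location, name in zip(fields[0::2], fields[1::2]):
            location = int(location)
            if location == 0:  # 0 was reserved for N-terminal modifications
                n_term = f"[{name}]-"
            elif location == -1:  # -1 was reserved for C-terminal modifications
                c_term = f"-[{name}]"
            else:  # 1-based residue position
                by_position[location] = name

    residues = []
    for position, residue in enumerate(seq, start=1):
        residues.append(residue)
        if position in by_position:
            residues.append(f"[{by_position[position]}]")

    peptidoform = n_term + "".join(residues) + c_term
    if charge is not None:
        peptidoform += f"/{charge}"
    return peptidoform


# Rows from the old README example; the charge column is an assumption.
old_rows = [
    ("AAGPSLSHTSGGTQSK", "", 12.1645, 2),
    ("AAINQKLIETGER", "6|Acetyl", 34.095, 2),
    ("AANDAGYFNDEMAPIEVKTK", "12|Oxidation|18|Acetyl", 37.3765, 3),
]

# Write the minimal tab-separated layout described in the new README text.
buffer = io.StringIO()
writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
writer.writerow(["spectrum_id", "peptidoform", "retention_time"])
for spectrum_id, (seq, mods, tr, charge) in enumerate(old_rows):
    writer.writerow([spectrum_id, ms2pip_to_proforma(seq, mods, charge), tr])

tsv_text = buffer.getvalue()
print(tsv_text)
```

The sketch keeps the retention times unrounded and does not validate modification names. Whether a given name is accepted still depends on ProForma and psm_utils support.
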
@@ -2,10 +2,18 @@
 
 from importlib.metadata import version
 
-from deeplc.core import finetune, finetune_and_predict, predict, predict_and_calibrate, train
+from deeplc.core import (
+    calibrate,
+    finetune,
+    finetune_and_predict,
+    predict,
+    predict_and_calibrate,
+    train,
+)
 
 __version__: str = version("deeplc")
 __all__: list[str] = [
+    "calibrate",
     "predict",
     "predict_and_calibrate",
     "finetune_and_predict",