# 3. Overview of input and output files

Input/output files depend on the external code used for structure relaxation.

An important technical element of our philosophy is the multi-stage
strategy for structure relaxation. Final structures and energies must be
high-quality in order to correctly drive evolution. Most newly
generated structures are far from a local minimum, and their high-quality
relaxation is extremely expensive. This cost can be offset if the first
stages of relaxation are done with cruder computational conditions;
only the last stages need high-quality calculations.
The first stages of relaxation of each candidate structure can use
cheaper computational conditions (basis set,
*k*-point sampling, pseudopotentials), a lower level of approximation
(forcefields *vs.* LDA *vs.* GGA), or even a different structure
relaxation code (see Section 2.5
for a list of supported codes). We strongly suggest that you initially
optimize the cell shape and atomic positions at constant unit cell volume,
and only then perform full optimization of all structural variables. While
optimizing at constant volume, you do not need to worry about Pulay stresses
in plane-wave calculations, so a small basis set is acceptable; for
variable-cell relaxation, however, you will need a high-quality basis set. For
structure relaxation, you can often get away with a small set of
*k*-points, but don’t forget to increase it sufficiently at the
last stage(s) of structure relaxation to get accurate energies. Use
your (and our) wisdom, be a strategist, and remember that poor
relaxation can ruin your results.

## 3.1. Input files

Suppose that the directory where the calculations are performed is `~/StructurePrediction`.
This directory will contain:

- The file `input.uspex`, thoroughly described in Section 5.
- The subdirectory `~/StructurePrediction/Specific/` with VASP, GULP, *etc.* executables, enumerated input files for structure relaxation (`INCAR_1`, `INCAR_2`, …), and pseudopotentials. You can alter these filenames (see Section 5.7).
- The subdirectory `~/StructurePrediction/Seeds/`, which contains files with seed structures. Seed structures should be in VASP5 POSCAR format, grouped into folders. Which folder is used for which generation is specified in Section 5.5.14.
- Files with molecule definitions (see Section 5.9).
- Files with environment definitions (see Section 5.10).
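Before launching a run, the layout described above can be checked with a few lines of Python. This is only a sketch: the helper name `missing_inputs` is hypothetical, and treating `Seeds/` and the molecule and environment files as optional (and so leaving them out of the default check) is an assumption, not something stated here.

```python
from pathlib import Path


def missing_inputs(workdir, required=("input.uspex", "Specific")):
    """Return the names from `required` that are absent from `workdir`.

    The default tuple is an assumption: it lists only the entries that
    every calculation described in this section appears to need.
    """
    root = Path(workdir).expanduser()
    return [name for name in required if not (root / name).exists()]
```

For example, `missing_inputs("~/StructurePrediction")` returns an empty list when both `input.uspex` and `Specific/` are present. Pass a longer `required` tuple to also insist on `Seeds`.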

### 3.1.1. `Specific` folder

Executables and enumerated input files for structure relaxation (using external codes such as VASP, GULP, …) should be put in the subdirectory `~/StructurePrediction/Specific/`.

- For VASP, put files `INCAR_1`, `INCAR_2`, …, *etc.*, defining how relaxation and energy calculations will be performed at each stage of relaxation (we recommend at least 3 stages of relaxation), and the corresponding `POTCAR_*` files with pseudopotentials. *E.g.*, `INCAR_1` and `INCAR_2` perform very crude structure relaxation of both atomic positions and cell parameters, keeping the volume fixed; `INCAR_3` performs full structure relaxation under constant pressure with medium precision; `INCAR_4` performs very accurate calculations. Each higher-level structure relaxation starts from the results of a lower-level optimization and improves them. `POTCAR` files of all relevant elements should also be in the `Specific` folder, for instance `POTCAR_O`, `POTCAR_C`, *etc.*
- For GULP, files `goptions_1`, `goptions_2`, … and `ginput_1`, `ginput_2`, … must be present. The former specify what kind of optimization is performed; the latter specify the details (interatomic potentials, pressure, temperature, number of relaxation iterations, *etc.*).
- For Quantum Espresso, files `qEspresso_options_1`, `qEspresso_options_2`, … must be present. All files should be normal QE input files with all parameters except atom coordinates, cell parameters and \(k\)-points (these will be written by USPEX at the end of the file). We recommend performing a multi-step relaxation. For instance, `qEspresso_options_1` does a crude structure relaxation of atomic positions with fixed cell parameters, `qEspresso_options_2` does full structure relaxation under constant external pressure with medium precision, and `qEspresso_options_3` does very accurate calculations.
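Because the stage files are distinguished only by their numeric suffix, it is easy to leave a gap (for example `INCAR_1`, `INCAR_2`, `INCAR_4`). The sketch below lists the stage numbers found in a folder for a given file prefix. The function names are hypothetical, and the requirement that stages be numbered contiguously from 1 is an assumption made for this check rather than a documented rule.

```python
import re
from pathlib import Path


def relaxation_stages(specific_dir, prefix="INCAR"):
    """Return the sorted stage numbers n for which `<prefix>_<n>` exists."""
    pattern = re.compile(rf"{re.escape(prefix)}_(\d+)")
    stages = []
    for entry in Path(specific_dir).iterdir():
        match = pattern.fullmatch(entry.name)
        if match:
            stages.append(int(match.group(1)))
    return sorted(stages)


def has_gaps(stages):
    """True if the stage numbers are not exactly 1, 2, ..., len(stages)."""
    return stages != list(range(1, len(stages) + 1))
```

The same function works for the other codes by changing the prefix, e.g. `relaxation_stages(path, prefix="ginput")` for GULP.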

#### 3.1.1.1. `INCAR_*` files in `Specific/` folder for VASP

To use USPEX correctly, you should carefully edit the files in the `Specific/`
folder, which control the structure relaxation. We take VASP as an example of
an external code:

- Your final structures have to be well relaxed, and energies precise. The point is that your energy ranking has to be correct (to check this, look at the `E_series.pdf` file in the output).
- Your `POTCAR` files: to yield correct results, the cores of your pseudopotentials (or PAW potentials) should not overlap by more than 10–15%.
- To have accurate relaxation at low cost, use multistage relaxation with at least three stages for each structure, *i.e.* at least three `INCAR` files (`INCAR_1`, `INCAR_2`, `INCAR_3`, …). We usually set 4–5 stages of relaxation.
- Your initial structures will usually be very far from local minima. In such cases it helps to relax atoms and cell shape at constant volume first (`ISIF` = 4 in `INCAR_1,2`), then do full relaxation (`ISIF` = 3 in `INCAR_3,4`), and finish with a very accurate single-point calculation (`ISIF` = 2 and `NSW` = 0 in `INCAR_5`). **Exceptions:** when you do fixed-cell predictions, and also in evolutionary metadynamics (except full relaxation), you must have `ISIF` = 2.
- When your volume does not change, you can use the default plane wave cutoff. When you optimize the cell volume (`ISIF` = 3), you must increase it by 30–40%, otherwise you get a large Pulay stress. Also, your convergence criteria can be loose in the beginning but have to be tight in the end: *e.g.*, `EDIFF` = 1e-2 and `EDIFFG` = 1e-1 in `INCAR_1`, gradually tightening to `EDIFF` = 1e-4 and `EDIFFG` = 1e-3 in `INCAR_4`. The maximum number of iterations (`NSW`) should be sufficiently large to enable good relaxation, but not so large that computer time is wasted on poor configurations. The larger your system, the larger `NSW` should be.
- Choosing an efficient relaxation algorithm can save a lot of time. In VASP, we recommend starting relaxation with conjugate gradients (`IBRION` = 2 and `POTIM` = 0.02) and, when the structure is closer to a local minimum, switching to `IBRION` = 1 and `POTIM` = 0.3.
- Even if you study an insulating system, many configurations that you will sample are going to be metallic, so to have well-converged results you must use a “metallic” treatment, which works for both metals and insulators. We recommend the Methfessel-Paxton smearing scheme (`ISMEAR` = 1). For a clearly metallic system, use `ISMEAR` = 1 and `SIGMA` = 0.1–0.2. For a clearly insulating system, we recommend `ISMEAR` = 1 and `SIGMA` starting at 0.1 (`INCAR_1`) and decreasing to 0.05.

Here we provide an example of files for carbon with 16 atoms in the unit
cell, with default `ENCUT` = 400 eV in `POTCAR`:

```
INCAR_1:
PREC=LOW
EDIFF=1e-2
EDIFFG=1e-1
NSW=65
ISIF=4
IBRION=2
POTIM=0.02
ISMEAR=1
SIGMA=0.10
```

```
INCAR_2:
PREC=NORMAL
EDIFF=1e-3
EDIFFG=1e-2
NSW=55
ISIF=4
IBRION=1
POTIM=0.30
ISMEAR=1
SIGMA=0.08
```

```
INCAR_3:
PREC=NORMAL
EDIFF=1e-3
EDIFFG=1e-2
ENCUT=520.0
NSW=65
ISIF=3
IBRION=2
POTIM=0.02
ISMEAR=1
SIGMA=0.07
```

```
INCAR_4:
PREC=NORMAL
EDIFF=1e-4
EDIFFG=1e-3
ENCUT=600.0
NSW=55
ISIF=3
IBRION=1
POTIM=0.30
ISMEAR=1
SIGMA=0.06
```

```
INCAR_5:
PREC=NORMAL
EDIFF=1e-4
EDIFFG=1e-3
ENCUT=600.0
NSW=0
ISIF=2
IBRION=2
POTIM=0.02
ISMEAR=1
SIGMA=0.05
```
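Two of the recommendations above lend themselves to a mechanical check: variable-volume stages (`ISIF` = 3) should raise the cutoff by at least 30% over the default, and `EDIFF` should not loosen from one stage to the next. The following is a minimal sketch, not a full INCAR parser: it assumes one `KEY=VALUE` tag per line, the function names are hypothetical, and the 1.3 factor is simply the lower end of the 30–40% range quoted above.

```python
def parse_incar(text):
    """Parse simple KEY=VALUE lines into a dict of upper-case keys to strings."""
    tags = {}
    for line in text.splitlines():
        # Drop trailing comments introduced by '#' or '!'.
        for marker in ("#", "!"):
            line = line.split(marker, 1)[0]
        if "=" in line:
            key, value = line.split("=", 1)
            tags[key.strip().upper()] = value.strip()
    return tags


def check_stages(stages, default_encut, min_factor=1.3):
    """Return a list of warnings for a sequence of parsed INCAR dicts."""
    warnings = []
    previous_ediff = None
    for number, tags in enumerate(stages, start=1):
        # Variable-volume stages need a larger cutoff to limit Pulay stress.
        if tags.get("ISIF") == "3":
            encut = float(tags.get("ENCUT", default_encut))
            if encut < min_factor * default_encut - 1e-9:
                warnings.append(f"stage {number}: ENCUT {encut:g} is too low")
        # Convergence criteria should tighten, or at least not loosen.
        if "EDIFF" in tags:
            ediff = float(tags["EDIFF"])
            if previous_ediff is not None and ediff > previous_ediff:
                warnings.append(f"stage {number}: EDIFF is looser than before")
            previous_ediff = ediff
    return warnings
```

Applied to the five example files with `default_encut=400.0`, both checks pass: `INCAR_3` uses 520 eV, which is exactly 30% above the default, and `EDIFF` only decreases.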

## 3.2. Output files

Output files are stored in the folder `results1` if this is a new calculation,
and in `results2`, `results3`, … if the calculation has been restarted or run
several times; there is a separate `results*` folder for each calculation.
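When a calculation has been restarted several times, scripts usually want the most recent folder. A small sketch for locating it is shown here; the function name is hypothetical, and it assumes that the highest number corresponds to the latest run, as the numbering scheme above suggests. Comparing the suffix numerically matters because `results10` sorts before `results2` as a string.

```python
import re
from pathlib import Path


def latest_results_dir(workdir):
    """Return the `results<N>` subdirectory with the largest N, or None."""
    best_number, best_path = 0, None
    for entry in Path(workdir).iterdir():
        match = re.fullmatch(r"results(\d+)", entry.name)
        if match and entry.is_dir() and int(match.group(1)) > best_number:
            best_number, best_path = int(match.group(1)), entry
    return best_path
```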

Caution

When looking at space groups in the file `Individuals`, keep in mind that USPEX often underdetermines space group symmetries, because of the finite precision of structure relaxation and relatively tight space group determination tolerances. You should visualize the predicted structures. To get the true space group symmetry, either increase symmetry tolerances (but this can be dangerous), or re-relax your structure with increased precision.

Each `results*` folder contains the following files:

- `OUTPUT.txt`: summarizes input variables, structures produced by USPEX, and their characteristics.
- `parameters.uspex`: a copy of the input file used in this calculation, with some defaults explicitly written, for your reference.
- `Individuals`: gives details of all produced structures (energies, unit cell volumes, space groups, variation operators that were used to produce the structures, \(k\)-points mesh used to compute the structures’ final energy, degrees of order, *etc.*).
- `BESTindividuals`: gives this information for the best structures from each generation.
- `convex_hull`: only for variable-composition calculations, this file gives all thermodynamically stable compositions and their enthalpies (per atom).
- `gatheredPOSCARS`: relaxed structures (in the VASP5 POSCAR format).
- `BESTgatheredPOSCARS`: the same data for the best structure in each generation.
- `gatheredPOSCARS_unrelaxed`: gives all structures produced by USPEX before relaxation.
- `enthalpies_complete.csv`: gives the enthalpies for all structures in each stage of relaxation.
- `origin`: shows which structures originated from which parents and through which variation operators.
- `goodStructures` and `extended_convex_hull` (for fixed- and variable-composition calculations, respectively): report all of the different structures (details) in order of decreasing stability, starting from the most stable structure and ending with the least stable.
- `goodStructures_POSCARS` and `extended_convex_hull_POSCARS` (for fixed- and variable-composition calculations, respectively): report all of the different structures (in the VASP5 POSCAR format) in order of decreasing stability, starting from the most stable structure and ending with the least stable.
- `*.uspex`: auxiliary files which complement `POSCARS` files if needed. They contain information about molecular association of atoms as well as periodic boundary conditions.
- Graphical files (**.svg**), for rapid visual assessment of the results:
  - `Energy_vs_N.svg` (`Fitness_vs_N.svg`): energy (fitness) as a function of structure number.
  - `Energy_vs_Volume.svg`: energy as a function of volume.
  - `Variation-Operators.svg`: energy of the child *vs.* parent(s); different operators are marked with different colors, which allows one to assess the performance of different variation operators. The graph also shows the evolution of each operator’s strength.
  - `E_series`: correlation between energies from relaxation steps \(i\) and \(i+1\); helps to detect problems and improve structure relaxation.
  - For variable compositions there is an additional graph, `extendedConvexHull.svg`, which shows the enthalpy of formation as a function of composition.
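Files such as `gatheredPOSCARS` hold many structures back to back, so post-processing usually starts by splitting them into individual records. The sketch below relies only on the VASP5 POSCAR layout (comment line, scale factor, three lattice vectors, element symbols, atom counts, coordinate-mode line, then one line per atom). It assumes that no record has a "Selective dynamics" line and that comment lines are non-empty; both are assumptions about the files, and the function name is hypothetical.

```python
def split_poscars(text):
    """Split concatenated VASP5 POSCAR records into a list of line lists."""
    lines = text.splitlines()
    records, start = [], 0
    while start < len(lines):
        if not lines[start].strip():  # skip blank separator lines
            start += 1
            continue
        if start + 8 > len(lines):
            raise ValueError(f"truncated POSCAR header at line {start + 1}")
        # Line 7 of each record holds the number of atoms per element.
        counts = [int(token) for token in lines[start + 6].split()]
        end = start + 8 + sum(counts)  # 8 header lines plus one line per atom
        records.append(lines[start:end])
        start = end
    return records
```

Each returned record can then be written to its own file with `"\n".join(record)` or passed to a structure-analysis library.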