3. Overview of input and output files
Input/output files depend on the external code used for structure relaxation.
An important technical element of our philosophy is the multi-stage strategy for structure relaxation. Final structures and energies must be of high quality in order to drive the evolution correctly. Most newly generated structures are far from a local minimum, and relaxing them at high quality from the start is extremely expensive. This cost can be offset by running the first stages of relaxation under cruder computational conditions; only the last stages need high-quality calculations. For each candidate structure, the first stages can use cheaper computational conditions (basis set, k-point sampling, pseudopotentials), a lower level of approximation (forcefields vs. LDA vs. GGA), or even a different structure relaxation code (see Section 2.5 for a list of supported codes).
We strongly suggest that you first optimize the cell shape and atomic positions at constant unit cell volume, and only then perform full optimization of all structural variables. While optimizing at constant volume, you do not need to worry about Pulay stresses in plane-wave calculations, so a small basis set is acceptable; variable-cell relaxation, however, requires a high-quality basis set. For structure relaxation you can often get away with a small set of k-points, but do not forget to increase it sufficiently at the last stage(s) of relaxation to get accurate energies. Use your (and our) wisdom, be a strategist, and remember that poor relaxation can ruin your results.
3.1. Input files
Suppose that the directory where the calculations are performed is ~/StructurePrediction. This directory will contain:
File input.uspex, thoroughly described in Section 5.
Subdirectory ~/StructurePrediction/Specific/ with VASP, GULP, etc. executables, enumerated input files for structure relaxation (INCAR_1, INCAR_2, …), and pseudopotentials. You can alter these filenames (see Section 5.8).
Subdirectory ~/StructurePrediction/Seeds/ with seed structures. Seed structures should be in VASP5 POSCAR format, grouped into folders. Which folder is used for which generation is specified in Section 5.6.14.
Files with molecule definitions (see Section 5.10).
Files with environment definitions (see Section 5.11).
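As a quick sanity check before starting a run, the layout described above can be inspected with a few lines of Python. This is only a sketch: the function name layout_report is hypothetical and not part of USPEX, and it only reports what is present without deciding which entries are mandatory for your calculation type.

```python
from pathlib import Path


def layout_report(workdir):
    """Report which of the entries named in this section exist in workdir.

    Hypothetical helper, not part of USPEX. Returns a dict mapping each
    entry name to True (present) or False (absent).
    """
    root = Path(workdir).expanduser()
    return {
        "input.uspex": (root / "input.uspex").is_file(),
        "Specific/": (root / "Specific").is_dir(),
        "Seeds/": (root / "Seeds").is_dir(),
    }


if __name__ == "__main__":
    # Example path taken from the text above.
    for name, present in layout_report("~/StructurePrediction").items():
        print(f"{name:12s} {'found' if present else 'not found'}")
```

Files with molecule and environment definitions are not checked here, because their names depend on the settings described in Sections 5.10 and 5.11.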
3.1.1. Specific folder
Executables and enumerated input files for structure relaxation (using external codes such as VASP, GULP, …) should be put in the subdirectory ~/StructurePrediction/Specific/.
For VASP, put files INCAR_1, INCAR_2, …, defining how relaxation and energy calculations will be performed at each stage of relaxation (we recommend at least 3 stages of relaxation), and the corresponding POTCAR_* files with pseudopotentials. E.g., INCAR_1 and INCAR_2 perform very crude structure relaxation of both atomic positions and cell parameters, keeping the volume fixed; INCAR_3 performs full structure relaxation under constant pressure with medium precision; INCAR_4 performs very accurate calculations. Each higher-level structure relaxation starts from the results of a lower-level optimization and improves them. POTCAR files of all relevant elements should also be in the Specific folder, for instance POTCAR_O, POTCAR_C, etc.
For GULP, files goptions_1, goptions_2, … and ginput_1, ginput_2, … must be present. The former specify what kind of optimization is performed, the latter specify the details (interatomic potentials, pressure, temperature, number of relaxation iterations, etc.).
For Quantum Espresso, files qEspresso_options_1, qEspresso_options_2, … must be present. All files should be normal QE input files with all parameters except atom coordinates, cell parameters and \(k\)-points (these will be written by USPEX at the end of the file). We recommend performing a multi-step relaxation. For instance, qEspresso_options_1 does a crude structure relaxation of atomic positions with fixed cell parameters; qEspresso_options_2 does full structure relaxation under constant external pressure with medium precision; and qEspresso_options_3 does very accurate calculations.
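Because the stage files are enumerated, a gap in the numbering (say, INCAR_1, INCAR_2, INCAR_4) is an easy mistake to make. The sketch below lists the stage indices found for a given prefix and reports missing ones. The helper names stage_numbers and missing_stages are hypothetical, and the assumption that stages must be numbered consecutively from 1 is ours, based on the examples in this section.

```python
import re
from pathlib import Path


def stage_numbers(specific_dir, prefix):
    """Sorted stage indices N of files named <prefix>_N in specific_dir."""
    pattern = re.compile(re.escape(prefix) + r"_(\d+)")
    found = []
    for path in Path(specific_dir).iterdir():
        match = pattern.fullmatch(path.name)
        if match and path.is_file():
            found.append(int(match.group(1)))
    return sorted(found)


def missing_stages(numbers):
    """Indices absent from the run 1..max(numbers); empty list if none."""
    if not numbers:
        return []
    present = set(numbers)
    return [n for n in range(1, max(numbers) + 1) if n not in present]
```

For a VASP setup one might call missing_stages(stage_numbers("Specific", "INCAR")); for GULP, the same check can be run once with the prefix goptions and once with ginput.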
3.1.1.1. INCAR_* files in Specific/ folder for VASP
To use USPEX correctly, you should carefully edit the files in the Specific/ folder, which control structure relaxation. We take VASP as an example of an external code:
Your final structures have to be well relaxed, and your energies precise. The point is that your energy ranking has to be correct (to check this, look at the E_series.pdf file in the output).
Your POTCAR files: to yield correct results, the cores of your pseudopotentials (or PAW potentials) should not overlap by more than 10–15%.
To have accurate relaxation at low cost, use multistage relaxation with at least three stages for each structure, i.e. at least three INCAR files (INCAR_1, INCAR_2, INCAR_3, …). We usually set 4–5 stages of relaxation.
Your initial structures will usually be very far from local minima. In such cases it helps to relax atoms and cell shape at constant volume first (ISIF = 4 in INCAR_1,2), then do full relaxation (ISIF = 3 in INCAR_3,4), and finish with a very accurate single-point calculation (ISIF = 2 and NSW = 0 in INCAR_5).
Exceptions: when you do fixed-cell predictions, and also in evolutionary metadynamics (except full relaxation), you must have ISIF = 2.
When your volume does not change, you can use the default plane wave cutoff. When you optimize the cell volume (ISIF = 3), you must increase the cutoff by 30–40%, otherwise you get a large Pulay stress. Your convergence criteria can also be loose in the beginning, but have to be tight in the end: e.g., EDIFF = 1e-2 and EDIFFG = 1e-1 in INCAR_1, gradually tightening to EDIFF = 1e-4 and EDIFFG = 1e-3 in INCAR_4. The maximum number of iterations (NSW) should be large enough to enable good relaxation, but not so large that computer time is wasted on poor configurations. The larger your system, the larger NSW should be.
Choosing an efficient relaxation algorithm can save a lot of time. In VASP, we recommend starting relaxation with conjugate gradients (IBRION = 2 and POTIM = 0.02) and, when the structure is closer to a local minimum, switching to IBRION = 1 and POTIM = 0.3.
Even if you study an insulating system, many configurations that you will sample are going to be metallic, so to have well-converged results you must use the “metallic” treatment, which works both for metals and insulators. We recommend the Methfessel-Paxton smearing scheme (ISMEAR = 1). For a clearly metallic system, use ISMEAR = 1 and SIGMA = 0.1–0.2. For a clearly insulating system, we recommend ISMEAR = 1 and SIGMA starting at 0.1 (INCAR_1) and decreasing to 0.05.
Here we provide example files for carbon with 16 atoms in the unit cell, with default ENCUT = 400 eV in POTCAR:
INCAR_1:
PREC=LOW
EDIFF=1e-2
EDIFFG=1e-1
NSW=65
ISIF=4
IBRION=2
POTIM=0.02
ISMEAR=1
SIGMA=0.10
INCAR_2:
PREC=NORMAL
EDIFF=1e-3
EDIFFG=1e-2
NSW=55
ISIF=4
IBRION=1
POTIM=0.30
ISMEAR=1
SIGMA=0.08
INCAR_3:
PREC=NORMAL
EDIFF=1e-3
EDIFFG=1e-2
ENCUT=520.0
NSW=65
ISIF=3
IBRION=2
POTIM=0.02
ISMEAR=1
SIGMA=0.07
INCAR_4:
PREC=NORMAL
EDIFF=1e-4
EDIFFG=1e-3
ENCUT=600.0
NSW=55
ISIF=3
IBRION=1
POTIM=0.30
ISMEAR=1
SIGMA=0.06
INCAR_5:
PREC=NORMAL
EDIFF=1e-4
EDIFFG=1e-3
ENCUT=600.0
NSW=0
ISIF=2
IBRION=2
POTIM=0.02
ISMEAR=1
SIGMA=0.05
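Two of the rules above lend themselves to an automatic check: EDIFF should not become looser from one stage to the next, and stages with ISIF = 3 should raise the cutoff by at least 30% over the default (for the carbon example, 1.3 × 400 eV = 520 eV, which is the value used in INCAR_3). The following is a minimal sketch under stated assumptions: parse_incar and check_stages are hypothetical helpers, not part of USPEX or VASP, and the parser only handles one KEY = VALUE pair per line as in the examples above.

```python
def parse_incar(text):
    """Parse simple 'KEY = VALUE' lines into a dict of strings.

    Assumption: one tag per line; text after '#' or '!' is a comment.
    """
    tags = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].split("!", 1)[0].strip()
        if "=" not in line:
            continue
        key, value = line.split("=", 1)
        tags[key.strip().upper()] = value.strip()
    return tags


def check_stages(stages, default_encut):
    """Return warnings for a list of parsed INCAR dicts given in stage order."""
    warnings = []
    previous_ediff = None
    for index, tags in enumerate(stages, start=1):
        if "EDIFF" in tags:
            ediff = float(tags["EDIFF"])
            if previous_ediff is not None and ediff > previous_ediff:
                warnings.append(f"stage {index}: EDIFF is looser than in the previous stage")
            previous_ediff = ediff
        if tags.get("ISIF") == "3":
            # Without an explicit ENCUT the default cutoff applies.
            encut = float(tags.get("ENCUT", default_encut))
            # Small tolerance guards against floating-point rounding.
            if encut < 1.3 * default_encut - 1e-6:
                warnings.append(f"stage {index}: ENCUT {encut:g} eV is below 1.3 x default")
    return warnings
```

Applied to the five example files with default_encut=400.0, this check should return an empty list, since EDIFF tightens monotonically and both ISIF = 3 stages use at least 520 eV.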
3.2. Output files
These are stored in the folder results1 if this is a new calculation, and in results2, results3, … if the calculation has been restarted or run several times; there is a separate results* folder for each calculation.
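When a calculation has been restarted many times, the most recent folder has to be picked by number rather than alphabetically (results10 sorts before results2 as a string). A minimal sketch, assuming the folders follow the results<N> naming described above; latest_results_dir is a hypothetical helper name:

```python
import re
from pathlib import Path


def latest_results_dir(workdir):
    """Return the results<N> folder with the largest N, or None if none exist."""
    numbered = []
    for path in Path(workdir).iterdir():
        match = re.fullmatch(r"results(\d+)", path.name)
        if match and path.is_dir():
            numbered.append((int(match.group(1)), path))
    if not numbered:
        return None
    # Compare by the integer index only, so results10 ranks above results2.
    return max(numbered, key=lambda item: item[0])[1]
```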
Caution
When looking at space groups in the file Individuals, keep in mind that USPEX often underdetermines space group symmetries, because of finite precision of structure relaxation and relatively tight space group determination tolerances. You should visualize the predicted structures. To get the true space group symmetry, either increase symmetry tolerances (but this can be dangerous), or re-relax your structure with increased precision.
The subdirectory contains the following files:
OUTPUT.txt: summarizes input variables, structures produced by USPEX, and their characteristics.
parameters.uspex: a copy of the file used in this calculation, with some defaults explicitly written, for your reference.
Individuals: gives details of all produced structures (energies, unit cell volumes, space groups, variation operators that were used to produce the structures, \(k\)-points mesh used to compute the structures’ final energy, degrees of order, etc.).
BESTindividuals: gives this information for the best structures from each generation.
convex_hull: only for variable-composition calculations, this file gives all thermodynamically stable compositions and their enthalpies (per atom).
gatheredPOSCARS: relaxed structures (in the VASP5 POSCAR format).
BESTgatheredPOSCARS: the same data for the best structure in each generation.
gatheredPOSCARS_unrelaxed: gives all structures produced by USPEX before relaxation.
enthalpies_complete.csv: gives the enthalpies for all structures in each stage of relaxation.
origin: shows which structures originated from which parents and through which variation operators.
goodStructures and extended_convex_hull (for fixed- and variable-composition calculations, respectively): report all of the different structures (details) in order of decreasing stability, starting from the most stable structure and ending with the least stable.
goodStructures_POSCARS and extended_convex_hull_POSCARS (for fixed- and variable-composition calculations, respectively): report all of the different structures (in the VASP5 POSCAR format) in order of decreasing stability, starting from the most stable structure and ending with the least stable.
*.uspex: auxiliary files which complement POSCARS files if needed. They contain information about molecular association of atoms as well as periodic boundary conditions.
Graphical files (.svg), for rapid visual assessment of the results:
Energy_vs_N.svg (Fitness_vs_N.svg): energy (fitness) as a function of structure number;
Energy_vs_Volume.svg: energy as a function of volume;
Variation-Operators.svg: energy of the child vs. parent(s); different operators are marked with different colors (this graph allows one to assess the performance of different variation operators). It also shows the evolution of each operator’s strength.
E_series: correlation between energies from relaxation steps \(i\) and \(i+1\); helps to detect problems and improve structure relaxation.
For variable compositions there is an additional graph, extendedConvexHull.svg, which shows the enthalpy of formation as a function of composition.
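Files such as gatheredPOSCARS hold many structures in a single file, so it is often convenient to split them into individual POSCAR blocks for visualization. The sketch below relies only on the general VASP5 POSCAR layout (comment line, scale factor, three lattice vectors, element symbols, atom counts, coordinate mode, then one line per atom). It assumes that blocks are simply concatenated, that comment lines are not empty, and that no "Selective dynamics" line is present; check these assumptions against your own output before relying on it. The function name split_poscars is hypothetical.

```python
def split_poscars(text):
    """Split concatenated VASP5 POSCAR blocks into a list of dicts.

    Each dict holds the comment line, element symbols, atom counts and the
    raw text of the block. Assumes 8 header lines per block and no
    'Selective dynamics' line.
    """
    lines = text.splitlines()
    structures = []
    i = 0
    while i < len(lines):
        if not lines[i].strip():
            i += 1  # tolerate blank lines between blocks
            continue
        header = lines[i:i + 8]
        if len(header) < 8:
            raise ValueError(f"incomplete POSCAR header at line {i + 1}")
        symbols = header[5].split()
        counts = [int(token) for token in header[6].split()]
        n_lines = 8 + sum(counts)
        block = lines[i:i + n_lines]
        if len(block) < n_lines:
            raise ValueError(f"incomplete coordinate list at line {i + 1}")
        structures.append({
            "comment": header[0].strip(),
            "symbols": symbols,
            "counts": counts,
            "text": "\n".join(block),
        })
        i += n_lines
    return structures
```

Each entry's "text" field can then be written to its own file and opened in a structure viewer, which is useful when following the advice above to visualize predicted structures.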