Loading experimental data with GlassPy
Author: Daniel R. Cassar
Introduction
GlassPy can load experimental data through its glasspy.data subpackage. Currently, SciGlass is the only available data source.
Basic Usage
The minimal example below loads SciGlass data into a pandas DataFrame using the default configuration, which includes most of the available data and metadata.
[1]:
from glasspy.data import SciGlass
source = SciGlass()
df = source.data
The first run may take a while, as GlassPy performs several computations to prepare the data. Subsequent runs will be significantly faster, since the data is cached locally on your machine.
[2]:
df
[2]:
Column groups: elements (H through Mg shown), property (SurfaceTensionAboveTg through SurfaceTension1673K shown), metadata (ChemicalAnalysis through NumberCompounds).
| ID | H | Li | Be | B | C | N | O | F | Na | Mg | ... | SurfaceTensionAboveTg | SurfaceTension1173K | SurfaceTension1473K | SurfaceTension1573K | SurfaceTension1673K | ChemicalAnalysis | Author | Year | NumberElements | NumberCompounds |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 20400020000 | 0.0 | 0.0 | 0.0 | 0.000000 | 0.0 | 0.0 | 0.666667 | 0.0 | 0.000000 | 0.000000 | ... | NaN | NaN | NaN | NaN | NaN | False | Volarovich M.P. | 1936 | 2 | 1 |
| 20500020001 | 0.0 | 0.0 | 0.0 | 0.000000 | 0.0 | 0.0 | 0.579213 | 0.0 | 0.196815 | 0.000000 | ... | NaN | NaN | NaN | NaN | NaN | False | Hoj J.W. | 1992 | 5 | 4 |
| 20500020002 | 0.0 | 0.0 | 0.0 | 0.000000 | 0.0 | 0.0 | 0.580869 | 0.0 | 0.193449 | 0.000000 | ... | NaN | NaN | NaN | NaN | NaN | False | Hoj J.W. | 1992 | 5 | 4 |
| 20500020003 | 0.0 | 0.0 | 0.0 | 0.000000 | 0.0 | 0.0 | 0.581986 | 0.0 | 0.187167 | 0.000000 | ... | NaN | NaN | NaN | NaN | NaN | False | Hoj J.W. | 1992 | 5 | 4 |
| 20500020004 | 0.0 | 0.0 | 0.0 | 0.000000 | 0.0 | 0.0 | 0.583672 | 0.0 | 0.183080 | 0.000000 | ... | NaN | NaN | NaN | NaN | NaN | False | Hoj J.W. | 1992 | 5 | 4 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 4493300611694 | 0.0 | 0.0 | 0.0 | 0.000000 | 0.0 | 0.0 | 0.625485 | 0.0 | 0.000000 | 0.049125 | ... | NaN | NaN | NaN | NaN | NaN | False | Murata T. | 2019 | 7 | 6 |
| 4493300611695 | 0.0 | 0.0 | 0.0 | 0.001948 | 0.0 | 0.0 | 0.637540 | 0.0 | 0.000000 | 0.009932 | ... | NaN | NaN | NaN | NaN | NaN | False | Murata T. | 2019 | 10 | 9 |
| 4493300611696 | 0.0 | 0.0 | 0.0 | 0.000000 | 0.0 | 0.0 | 0.635921 | 0.0 | 0.000000 | 0.000000 | ... | NaN | NaN | NaN | NaN | NaN | False | Murata T. | 2019 | 8 | 7 |
| 4493300611697 | 0.0 | 0.0 | 0.0 | 0.014544 | 0.0 | 0.0 | 0.622226 | 0.0 | 0.035890 | 0.000000 | ... | NaN | NaN | NaN | NaN | NaN | False | Murata T. | 2019 | 9 | 8 |
| 4493300611698 | 0.0 | 0.0 | 0.0 | 0.041532 | 0.0 | 0.0 | 0.634462 | 0.0 | 0.000000 | 0.000487 | ... | NaN | NaN | NaN | NaN | NaN | False | Murata T. | 2019 | 7 | 6 |
283102 rows × 793 columns
To avoid naming conflicts and simplify navigation, the DataFrame columns are organized into two levels. The first level groups the columns into elements, compounds, property, and metadata; the second level holds the individual column names.
[3]:
print(df.columns.levels[0])
Index(['elements', 'compounds', 'property', 'metadata'], dtype='str')
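The two-level layout can be tried out on a small stand-in table before working with the full dataset. The sketch below builds a toy DataFrame with the same kind of column structure; the IDs, the reduced set of columns, and the variable names are made up for illustration and are not taken from SciGlass.

```python
import numpy as np
import pandas as pd

# Toy stand-in for the SciGlass DataFrame: two-level columns, IDs as the index.
# The IDs and the choice of columns are hypothetical.
columns = pd.MultiIndex.from_tuples(
    [
        ("elements", "O"),
        ("elements", "Si"),
        ("property", "Tg"),
        ("metadata", "Year"),
    ]
)
toy = pd.DataFrame(
    [
        [0.666667, 0.333333, np.nan, 1936],
        [0.6, 0.4, 1017.15, 1992],
    ],
    index=pd.Index([1, 2], name="ID"),
    columns=columns,
)

# First-level labels, in order of appearance.
groups = list(toy.columns.get_level_values(0).unique())

# Selecting a first-level label returns a sub-frame with single-level columns.
elements = toy["elements"]

# A (group, name) tuple selects one column directly as a Series.
tg = toy[("property", "Tg")]

print(groups)
print(list(elements.columns))
```

Selecting with a tuple, as in `toy[("property", "Tg")]`, is equivalent to chaining `toy["property"]["Tg"]` and avoids creating the intermediate sub-frame.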
To explore the chemical composition data, select the compounds or elements group from the first column level.
[4]:
els = df["elements"]
els
[4]:
| ID | H | Li | Be | B | C | N | O | F | Na | Mg | ... | W | Re | Pt | Au | Hg | Tl | Pb | Bi | Th | U |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 20400020000 | 0.0 | 0.0 | 0.0 | 0.000000 | 0.0 | 0.0 | 0.666667 | 0.0 | 0.000000 | 0.000000 | ... | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 20500020001 | 0.0 | 0.0 | 0.0 | 0.000000 | 0.0 | 0.0 | 0.579213 | 0.0 | 0.196815 | 0.000000 | ... | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 20500020002 | 0.0 | 0.0 | 0.0 | 0.000000 | 0.0 | 0.0 | 0.580869 | 0.0 | 0.193449 | 0.000000 | ... | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 20500020003 | 0.0 | 0.0 | 0.0 | 0.000000 | 0.0 | 0.0 | 0.581986 | 0.0 | 0.187167 | 0.000000 | ... | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 20500020004 | 0.0 | 0.0 | 0.0 | 0.000000 | 0.0 | 0.0 | 0.583672 | 0.0 | 0.183080 | 0.000000 | ... | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 4493300611694 | 0.0 | 0.0 | 0.0 | 0.000000 | 0.0 | 0.0 | 0.625485 | 0.0 | 0.000000 | 0.049125 | ... | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 4493300611695 | 0.0 | 0.0 | 0.0 | 0.001948 | 0.0 | 0.0 | 0.637540 | 0.0 | 0.000000 | 0.009932 | ... | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 4493300611696 | 0.0 | 0.0 | 0.0 | 0.000000 | 0.0 | 0.0 | 0.635921 | 0.0 | 0.000000 | 0.000000 | ... | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 4493300611697 | 0.0 | 0.0 | 0.0 | 0.014544 | 0.0 | 0.0 | 0.622226 | 0.0 | 0.035890 | 0.000000 | ... | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 4493300611698 | 0.0 | 0.0 | 0.0 | 0.041532 | 0.0 | 0.0 | 0.634462 | 0.0 | 0.000000 | 0.000487 | ... | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
283102 rows × 76 columns
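Once the element columns are isolated, ordinary pandas boolean indexing can narrow the table to compositions of interest. The following is a minimal sketch on a made-up composition table (the IDs and fractions are hypothetical); the same expressions should apply to the `els` DataFrame above, assuming its values are non-negative fractions.

```python
import pandas as pd

# Toy composition table in atomic fractions (made-up rows, not SciGlass data).
els_toy = pd.DataFrame(
    {
        "O": [0.666667, 0.60, 0.58],
        "Si": [0.333333, 0.25, 0.22],
        "Na": [0.0, 0.15, 0.20],
        "Ag": [0.0, 0.0, 0.0],
    },
    index=pd.Index([101, 102, 103], name="ID"),
)

# Keep only the glasses that contain sodium.
with_na = els_toy[els_toy["Na"] > 0]

# Drop element columns that are zero for every remaining row.
present = with_na.loc[:, (with_na != 0).any(axis=0)]

# Count how many distinct elements each glass contains.
n_elements = (els_toy > 0).sum(axis=1)

print(list(with_na.index))
print(list(present.columns))
```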
The example below shows how to retrieve \(T_g\) data from the property level.
[5]:
Tg = df["property"]["Tg"]
Tg
[5]:
ID
20400020000 NaN
20500020001 1017.15
20500020002 1096.15
20500020003 1013.15
20500020004 1013.15
...
4493300611694 NaN
4493300611695 NaN
4493300611696 NaN
4493300611697 NaN
4493300611698 NaN
Name: Tg, Length: 283102, dtype: float64
As you can see, not all entries have a value for \(T_g\).
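Missing entries appear as NaN, so it is often useful to count or discard them before further analysis. Below is a small sketch using a toy Series with hypothetical IDs; `notna` and `dropna` are standard pandas methods and should work the same way on the `Tg` Series above.

```python
import numpy as np
import pandas as pd

# Toy Tg series with two missing entries (IDs are made up).
tg_toy = pd.Series(
    [np.nan, 1017.15, 1096.15, np.nan],
    index=pd.Index([1, 2, 3, 4], name="ID"),
    name="Tg",
)

# Number and fraction of entries that actually report a value.
n_measured = int(tg_toy.notna().sum())
fraction_measured = float(tg_toy.notna().mean())

# Keep only the entries with a reported value.
tg_measured = tg_toy.dropna()

print(n_measured, fraction_measured)
```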
To check for all available properties in GlassPy, run:
[6]:
print(SciGlass.available_properties())
['T0', 'T1', 'T2', 'T3', 'T4', 'T5', 'T6', 'T7', 'T8', 'T9', 'T10', 'T11', 'T12', 'Viscosity773K', 'Viscosity873K', 'Viscosity973K', 'Viscosity1073K', 'Viscosity1173K', 'Viscosity1273K', 'Viscosity1373K', 'Viscosity1473K', 'Viscosity1573K', 'Viscosity1673K', 'Viscosity1773K', 'Viscosity1873K', 'Viscosity2073K', 'Viscosity2273K', 'Viscosity2473K', 'Tg', 'Tmelt', 'Tliquidus', 'TLittletons', 'TAnnealing', 'Tstrain', 'Tsoft', 'TdilatometricSoftening', 'AbbeNum', 'RefractiveIndex', 'RefractiveIndexLow', 'RefractiveIndexHigh', 'MeanDispersion', 'Permittivity', 'TangentOfLossAngle', 'TresistivityIs1MOhm.m', 'Resistivity293K', 'Resistivity373K', 'Resistivity423K', 'Resistivity573K', 'Resistivity1073K', 'Resistivity1273K', 'Resistivity1473K', 'Resistivity1673K', 'YoungModulus', 'ShearModulus', 'Microhardness', 'PoissonRatio', 'Density293K', 'Density1073K', 'Density1273K', 'Density1473K', 'Density1673K', 'ThermalConductivity', 'ThermalShockRes', 'CTEbelowTg', 'CTE328K', 'CTE373K', 'CTE433K', 'CTE483K', 'CTE623K', 'Cp293K', 'Cp473K', 'Cp673K', 'Cp1073K', 'Cp1273K', 'Cp1473K', 'Cp1673K', 'NucleationTemperature', 'NucleationRate', 'TMaxGrowthVelocity', 'MaxGrowthVelocity', 'CrystallizationPeak', 'CrystallizationOnset', 'SurfaceTensionAboveTg', 'SurfaceTension1173K', 'SurfaceTension1473K', 'SurfaceTension1573K', 'SurfaceTension1673K']
If you are unfamiliar with pandas DataFrames, refer to the pandas documentation.
Controlling the Initial Data Load
Loading the complete SciGlass dataset can be time-consuming, so it is advisable to load only the data you need. You can control what is loaded by passing configuration dictionaries to the SciGlass class.
For example, suppose you want to exclude glasses containing silver or gold, retrieve only glass transition temperature data, and omit compound information. You can do so as follows:
[7]:
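# Start from every available property, then remove Tg to get the list to drop.
all_properties_except_Tg = SciGlass.available_properties()
all_properties_except_Tg.remove("Tg")

# Exclude glasses containing silver or gold.
config_el = {
    "drop": ["Ag", "Au"],
}

# Retrieve only glass transition temperature data.
config_prop = {
    "keep": ["Tg"],
    "drop": all_properties_except_Tg,
}

# An empty configuration omits compound information.
config_comp = {}

source = SciGlass(
    elements_cfg=config_el,
    properties_cfg=config_prop,
    compounds_cfg=config_comp,
)
df = source.data
df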
[7]:
Column groups: elements (H through U), property (Tg), metadata (ChemicalAnalysis through NumberElements).
| ID | H | Li | Be | B | C | N | O | F | Na | Mg | ... | Tl | Pb | Bi | Th | U | Tg | ChemicalAnalysis | Author | Year | NumberElements |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 20500020001 | 0.0 | 0.000000 | 0.0 | 0.000000 | 0.0 | 0.0 | 57.921249 | 0.0 | 19.681530 | 0.0 | ... | 0.000000 | 0.0 | 0.000000 | 0.0 | 0.0 | 1017.15 | False | Hoj J.W. | 1992 | 5 |
| 20500020002 | 0.0 | 0.000000 | 0.0 | 0.000000 | 0.0 | 0.0 | 58.086941 | 0.0 | 19.344940 | 0.0 | ... | 0.000000 | 0.0 | 0.000000 | 0.0 | 0.0 | 1096.15 | False | Hoj J.W. | 1992 | 5 |
| 20500020003 | 0.0 | 0.000000 | 0.0 | 0.000000 | 0.0 | 0.0 | 58.198601 | 0.0 | 18.716690 | 0.0 | ... | 0.000000 | 0.0 | 0.000000 | 0.0 | 0.0 | 1013.15 | False | Hoj J.W. | 1992 | 5 |
| 20500020004 | 0.0 | 0.000000 | 0.0 | 0.000000 | 0.0 | 0.0 | 58.367241 | 0.0 | 18.308001 | 0.0 | ... | 0.000000 | 0.0 | 0.000000 | 0.0 | 0.0 | 1013.15 | False | Hoj J.W. | 1992 | 5 |
| 20500020005 | 0.0 | 0.000000 | 0.0 | 0.000000 | 0.0 | 0.0 | 58.282768 | 0.0 | 18.264561 | 0.0 | ... | 0.000000 | 0.0 | 0.000000 | 0.0 | 0.0 | 978.15 | False | Hoj J.W. | 1992 | 5 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 4493200611415 | 0.0 | 7.250638 | 0.0 | 2.368801 | 0.0 | 0.0 | 59.389221 | 0.0 | 0.000000 | 0.0 | ... | 8.964828 | 0.0 | 5.536447 | 0.0 | 0.0 | 543.15 | False | Jung Woo Man | 2019 | 9 |
| 4493200611416 | 0.0 | 7.445931 | 0.0 | 2.358826 | 0.0 | 0.0 | 59.595871 | 0.0 | 0.000000 | 0.0 | ... | 6.650183 | 0.0 | 5.808963 | 0.0 | 0.0 | 545.15 | False | Jung Woo Man | 2019 | 9 |
| 4493200611417 | 0.0 | 6.593068 | 0.0 | 10.288480 | 0.0 | 0.0 | 59.600090 | 0.0 | 0.000000 | 0.0 | ... | 10.782570 | 0.0 | 0.000000 | 0.0 | 0.0 | 532.15 | False | Jung Woo Man | 2019 | 9 |
| 4493200611418 | 0.0 | 5.919064 | 0.0 | 1.936039 | 0.0 | 0.0 | 64.014076 | 0.0 | 0.000000 | 0.0 | ... | 7.322553 | 0.0 | 0.000000 | 0.0 | 0.0 | 506.15 | False | Jung Woo Man | 2019 | 9 |
| 4493200611419 | 0.0 | 6.371798 | 0.0 | 2.019926 | 0.0 | 0.0 | 63.761761 | 0.0 | 0.000000 | 0.0 | ... | 7.882636 | 0.0 | 0.000000 | 0.0 | 0.0 | 522.15 | False | Jung Woo Man | 2019 | 9 |
91738 rows × 78 columns
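The keep/drop pair used in the properties configuration above can be built by a small helper, which is convenient when several properties are kept at once. This is only a sketch: `split_keep_drop` is a hypothetical helper, not part of GlassPy, and the short `available` list stands in for the list returned by `SciGlass.available_properties()`.

```python
def split_keep_drop(available, keep):
    """Return a dict with `keep` as given and every other available name under `drop`."""
    unknown = [name for name in keep if name not in available]
    if unknown:
        raise ValueError(f"unknown properties: {unknown}")
    return {
        "keep": list(keep),
        "drop": [name for name in available if name not in keep],
    }


# Shortened stand-in for the full property list (assumption for illustration).
available = ["Tg", "Tmelt", "Tliquidus", "Density293K"]

config_prop_sketch = split_keep_drop(available, ["Tg"])
print(config_prop_sketch)
```

Unlike calling `remove` on the list, this helper leaves the original list untouched and raises an error on a misspelled property name.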
See the documentation for the SciGlass class for more information on how to control your initial data collection.