Loading experimental data with GlassPy

Author: Daniel R. Cassar

Introduction

GlassPy can load experimental data through its glasspy.data subpackage. Currently, SciGlass is the only available data source.

Basic Usage

The minimal example below loads SciGlass data into a pandas DataFrame using the default configuration, which includes most of the available data and metadata.

[1]:
from glasspy.data import SciGlass

source = SciGlass()
df = source.data

The first run may take a while, as GlassPy performs several computations to prepare the data. Subsequent runs will be significantly faster, since the data is cached locally on your machine.

[2]:
df
[2]:
elements ... property metadata
H Li Be B C N O F Na Mg ... SurfaceTensionAboveTg SurfaceTension1173K SurfaceTension1473K SurfaceTension1573K SurfaceTension1673K ChemicalAnalysis Author Year NumberElements NumberCompounds
ID
20400020000 0.0 0.0 0.0 0.000000 0.0 0.0 0.666667 0.0 0.000000 0.000000 ... NaN NaN NaN NaN NaN False Volarovich M.P. 1936 2 1
20500020001 0.0 0.0 0.0 0.000000 0.0 0.0 0.579213 0.0 0.196815 0.000000 ... NaN NaN NaN NaN NaN False Hoj J.W. 1992 5 4
20500020002 0.0 0.0 0.0 0.000000 0.0 0.0 0.580869 0.0 0.193449 0.000000 ... NaN NaN NaN NaN NaN False Hoj J.W. 1992 5 4
20500020003 0.0 0.0 0.0 0.000000 0.0 0.0 0.581986 0.0 0.187167 0.000000 ... NaN NaN NaN NaN NaN False Hoj J.W. 1992 5 4
20500020004 0.0 0.0 0.0 0.000000 0.0 0.0 0.583672 0.0 0.183080 0.000000 ... NaN NaN NaN NaN NaN False Hoj J.W. 1992 5 4
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
4493300611694 0.0 0.0 0.0 0.000000 0.0 0.0 0.625485 0.0 0.000000 0.049125 ... NaN NaN NaN NaN NaN False Murata T. 2019 7 6
4493300611695 0.0 0.0 0.0 0.001948 0.0 0.0 0.637540 0.0 0.000000 0.009932 ... NaN NaN NaN NaN NaN False Murata T. 2019 10 9
4493300611696 0.0 0.0 0.0 0.000000 0.0 0.0 0.635921 0.0 0.000000 0.000000 ... NaN NaN NaN NaN NaN False Murata T. 2019 8 7
4493300611697 0.0 0.0 0.0 0.014544 0.0 0.0 0.622226 0.0 0.035890 0.000000 ... NaN NaN NaN NaN NaN False Murata T. 2019 9 8
4493300611698 0.0 0.0 0.0 0.041532 0.0 0.0 0.634462 0.0 0.000000 0.000487 ... NaN NaN NaN NaN NaN False Murata T. 2019 7 6

283102 rows × 793 columns

To avoid naming conflicts and simplify navigation, the DataFrame columns are organized into two levels. The first level groups the columns into composition (elements and compounds), property, and metadata.

[3]:
print(df.columns.levels[0])
Index(['elements', 'compounds', 'property', 'metadata'], dtype='str')
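
If two-level columns are new to you, the sketch below may help. It does not use GlassPy: it builds a small toy DataFrame with made-up values and the same kind of two-level column layout, then shows two ways of selecting from it with plain pandas. The name toy and all values are assumptions made for illustration only.

```python
import pandas as pd

# Toy stand-in for the GlassPy DataFrame: two-level columns, made-up values.
columns = pd.MultiIndex.from_tuples(
    [
        ("elements", "Si"),
        ("elements", "O"),
        ("property", "Tg"),
        ("metadata", "Year"),
    ]
)
toy = pd.DataFrame(
    [
        [0.333, 0.667, 1475.0, 1936],
        [0.250, 0.600, float("nan"), 1992],
    ],
    index=pd.Index([1, 2], name="ID"),
    columns=columns,
)

# First-level groups, in order of appearance.
groups = list(toy.columns.get_level_values(0).unique())

# Selecting a first-level key returns a sub-DataFrame with single-level columns.
elements = toy["elements"]

# A (first level, second level) tuple selects a single column as a Series.
tg = toy[("property", "Tg")]

print(groups)
print(elements)
print(tg)
```

Selecting with a tuple and chaining two selections, as in toy["property"]["Tg"], return the same values. The tuple form does it in a single lookup.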

To explore the chemical composition data, select the compounds or elements group from the first level of the DataFrame.

[4]:
els = df["elements"]

els
[4]:
H Li Be B C N O F Na Mg ... W Re Pt Au Hg Tl Pb Bi Th U
ID
20400020000 0.0 0.0 0.0 0.000000 0.0 0.0 0.666667 0.0 0.000000 0.000000 ... 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
20500020001 0.0 0.0 0.0 0.000000 0.0 0.0 0.579213 0.0 0.196815 0.000000 ... 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
20500020002 0.0 0.0 0.0 0.000000 0.0 0.0 0.580869 0.0 0.193449 0.000000 ... 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
20500020003 0.0 0.0 0.0 0.000000 0.0 0.0 0.581986 0.0 0.187167 0.000000 ... 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
20500020004 0.0 0.0 0.0 0.000000 0.0 0.0 0.583672 0.0 0.183080 0.000000 ... 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
4493300611694 0.0 0.0 0.0 0.000000 0.0 0.0 0.625485 0.0 0.000000 0.049125 ... 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
4493300611695 0.0 0.0 0.0 0.001948 0.0 0.0 0.637540 0.0 0.000000 0.009932 ... 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
4493300611696 0.0 0.0 0.0 0.000000 0.0 0.0 0.635921 0.0 0.000000 0.000000 ... 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
4493300611697 0.0 0.0 0.0 0.014544 0.0 0.0 0.622226 0.0 0.035890 0.000000 ... 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
4493300611698 0.0 0.0 0.0 0.041532 0.0 0.0 0.634462 0.0 0.000000 0.000487 ... 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0

283102 rows × 76 columns
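
Once the element columns are isolated, ordinary pandas boolean indexing can narrow the data down by composition. The following is a minimal sketch on a toy table with made-up fractions and hypothetical IDs, not on the actual SciGlass data. It keeps only the rows that contain sodium and counts how many elements are present in each row.

```python
import pandas as pd

# Toy element fractions (made-up values), indexed by a hypothetical ID.
els = pd.DataFrame(
    {
        "Si": [0.333, 0.250, 0.000],
        "O": [0.667, 0.600, 0.600],
        "Na": [0.000, 0.150, 0.000],
        "B": [0.000, 0.000, 0.400],
    },
    index=pd.Index([101, 102, 103], name="ID"),
)

# Keep only the rows where sodium is present.
with_na = els[els["Na"] > 0]

# Count how many elements have a nonzero fraction in each row.
n_elements = (els > 0).sum(axis=1)

print(with_na)
print(n_elements)
```

The same expressions should apply unchanged to the els DataFrame obtained from GlassPy above, since it is a regular pandas DataFrame.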

The example below shows how to retrieve \(T_g\) data from the property level.

[5]:
Tg = df["property"]["Tg"]

Tg
[5]:
ID
20400020000          NaN
20500020001      1017.15
20500020002      1096.15
20500020003      1013.15
20500020004      1013.15
                  ...
4493300611694        NaN
4493300611695        NaN
4493300611696        NaN
4493300611697        NaN
4493300611698        NaN
Name: Tg, Length: 283102, dtype: float64

As you can see, not all entries have a value for \(T_g\).
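
Missing values are stored as NaN, so the usual pandas tools for missing data apply. The sketch below uses a short toy Series with hypothetical IDs rather than the real data. It counts the entries that report a value and then discards the rest.

```python
import pandas as pd

# Toy Tg Series with hypothetical IDs; NaN marks entries without a reported value.
tg = pd.Series(
    [float("nan"), 1017.15, 1096.15, float("nan")],
    index=pd.Index([10, 11, 12, 13], name="ID"),
    name="Tg",
)

# Number of entries that actually report a value.
n_reported = int(tg.notna().sum())

# Drop the entries without a value.
tg_reported = tg.dropna()

print(n_reported)
print(tg_reported)
```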

To list all properties available in GlassPy, run:

[6]:
print(SciGlass.available_properties())
['T0', 'T1', 'T2', 'T3', 'T4', 'T5', 'T6', 'T7', 'T8', 'T9', 'T10', 'T11', 'T12', 'Viscosity773K', 'Viscosity873K', 'Viscosity973K', 'Viscosity1073K', 'Viscosity1173K', 'Viscosity1273K', 'Viscosity1373K', 'Viscosity1473K', 'Viscosity1573K', 'Viscosity1673K', 'Viscosity1773K', 'Viscosity1873K', 'Viscosity2073K', 'Viscosity2273K', 'Viscosity2473K', 'Tg', 'Tmelt', 'Tliquidus', 'TLittletons', 'TAnnealing', 'Tstrain', 'Tsoft', 'TdilatometricSoftening', 'AbbeNum', 'RefractiveIndex', 'RefractiveIndexLow', 'RefractiveIndexHigh', 'MeanDispersion', 'Permittivity', 'TangentOfLossAngle', 'TresistivityIs1MOhm.m', 'Resistivity293K', 'Resistivity373K', 'Resistivity423K', 'Resistivity573K', 'Resistivity1073K', 'Resistivity1273K', 'Resistivity1473K', 'Resistivity1673K', 'YoungModulus', 'ShearModulus', 'Microhardness', 'PoissonRatio', 'Density293K', 'Density1073K', 'Density1273K', 'Density1473K', 'Density1673K', 'ThermalConductivity', 'ThermalShockRes', 'CTEbelowTg', 'CTE328K', 'CTE373K', 'CTE433K', 'CTE483K', 'CTE623K', 'Cp293K', 'Cp473K', 'Cp673K', 'Cp1073K', 'Cp1273K', 'Cp1473K', 'Cp1673K', 'NucleationTemperature', 'NucleationRate', 'TMaxGrowthVelocity', 'MaxGrowthVelocity', 'CrystallizationPeak', 'CrystallizationOnset', 'SurfaceTensionAboveTg', 'SurfaceTension1173K', 'SurfaceTension1473K', 'SurfaceTension1573K', 'SurfaceTension1673K']

If you are unfamiliar with pandas DataFrames, refer to the pandas documentation.

Controlling the Initial Data Load

Loading the complete SciGlass dataset can be time-consuming, so it is advisable to load only the data you need. You can control what is loaded by passing configuration dictionaries to the SciGlass class.

For example, suppose you want to exclude glasses containing silver or gold, retrieve only glass transition temperature data, and omit compound information. You can do so as follows:

[7]:
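# Start from the full list of properties and remove Tg,
# leaving every property that should be dropped.
all_properties_except_Tg = SciGlass.available_properties()
all_properties_except_Tg.remove("Tg")

# Elements: exclude silver and gold.
config_el = {
    "drop": ["Ag", "Au"],
}

# Properties: keep Tg and drop all the others.
config_prop = {
    "keep": ["Tg"],
    "drop": all_properties_except_Tg,
}

# Compounds: an empty configuration, so compound information is omitted.
config_comp = {}

source = SciGlass(
    elements_cfg=config_el,
    properties_cfg=config_prop,
    compounds_cfg=config_comp,
)

df = source.data
df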
[7]:
elements property metadata
H Li Be B C N O F Na Mg ... Tl Pb Bi Th U Tg ChemicalAnalysis Author Year NumberElements
ID
20500020001 0.0 0.000000 0.0 0.000000 0.0 0.0 57.921249 0.0 19.681530 0.0 ... 0.000000 0.0 0.000000 0.0 0.0 1017.15 False Hoj J.W. 1992 5
20500020002 0.0 0.000000 0.0 0.000000 0.0 0.0 58.086941 0.0 19.344940 0.0 ... 0.000000 0.0 0.000000 0.0 0.0 1096.15 False Hoj J.W. 1992 5
20500020003 0.0 0.000000 0.0 0.000000 0.0 0.0 58.198601 0.0 18.716690 0.0 ... 0.000000 0.0 0.000000 0.0 0.0 1013.15 False Hoj J.W. 1992 5
20500020004 0.0 0.000000 0.0 0.000000 0.0 0.0 58.367241 0.0 18.308001 0.0 ... 0.000000 0.0 0.000000 0.0 0.0 1013.15 False Hoj J.W. 1992 5
20500020005 0.0 0.000000 0.0 0.000000 0.0 0.0 58.282768 0.0 18.264561 0.0 ... 0.000000 0.0 0.000000 0.0 0.0 978.15 False Hoj J.W. 1992 5
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
4493200611415 0.0 7.250638 0.0 2.368801 0.0 0.0 59.389221 0.0 0.000000 0.0 ... 8.964828 0.0 5.536447 0.0 0.0 543.15 False Jung Woo Man 2019 9
4493200611416 0.0 7.445931 0.0 2.358826 0.0 0.0 59.595871 0.0 0.000000 0.0 ... 6.650183 0.0 5.808963 0.0 0.0 545.15 False Jung Woo Man 2019 9
4493200611417 0.0 6.593068 0.0 10.288480 0.0 0.0 59.600090 0.0 0.000000 0.0 ... 10.782570 0.0 0.000000 0.0 0.0 532.15 False Jung Woo Man 2019 9
4493200611418 0.0 5.919064 0.0 1.936039 0.0 0.0 64.014076 0.0 0.000000 0.0 ... 7.322553 0.0 0.000000 0.0 0.0 506.15 False Jung Woo Man 2019 9
4493200611419 0.0 6.371798 0.0 2.019926 0.0 0.0 63.761761 0.0 0.000000 0.0 ... 7.882636 0.0 0.000000 0.0 0.0 522.15 False Jung Woo Man 2019 9

91738 rows × 78 columns
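
The drop list in the configuration above is the full property list minus the properties to keep. If you build such lists often, a small helper can do it without modifying the original list in place. The function below, all_except, is a hypothetical helper and not part of GlassPy. The property names are a short excerpt of the list printed earlier, hard-coded here so that the sketch runs on its own.

```python
def all_except(available, keep):
    """Return the items of `available` that are not in `keep`, preserving order."""
    keep_set = set(keep)
    unknown = keep_set - set(available)
    if unknown:
        # Fail early on typos instead of silently keeping nothing.
        raise ValueError(f"Unknown properties: {sorted(unknown)}")
    return [name for name in available if name not in keep_set]


# Short excerpt of the property names listed earlier (hard-coded for this sketch).
available = ["Tg", "Tmelt", "Tliquidus", "Density293K"]

keep = ["Tg"]
config_prop = {
    "keep": keep,
    "drop": all_except(available, keep),
}

print(config_prop)
```

With GlassPy installed, the hard-coded list could be replaced by the result of SciGlass.available_properties().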

See the documentation for the SciGlass class for more information on how to control the initial data load.