.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "auto_examples/exploratory/plot_explore.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        :ref:`Go to the end ` to download the full example code
        or to run this example in your browser via Binder.

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_auto_examples_exploratory_plot_explore.py:


Exploring data
==============

Explores the Tecator data set by plotting the functional data and calculating
means and derivatives.

.. GENERATED FROM PYTHON SOURCE LINES 8-12

.. code-block:: Python


    # Author: Miguel Carbajo Berrocal
    # License: MIT


.. GENERATED FROM PYTHON SOURCE LINES 13-20

This example explores the functional properties of the
:func:`Tecator ` dataset, which contains the infrared absorbance spectra of
meat samples. The usual objective is to predict the fat, water, and protein
content of each sample. Here we only want to discriminate between meat with
less than 20% fat and meat with a higher fat content.

.. GENERATED FROM PYTHON SOURCE LINES 20-28

.. code-block:: Python

    from skfda.datasets import fetch_tecator

    X, y = fetch_tecator(return_X_y=True, as_frame=True)
    # The functional data is stored in the first column of the data frame.
    fd = X.iloc[:, 0].array
    # Fat content of each sample, as a NumPy array.
    fat = y["fat"].to_numpy()


.. GENERATED FROM PYTHON SOURCE LINES 34-36

We now plot the samples containing less than 20% fat in blue and the rest in
red.

.. GENERATED FROM PYTHON SOURCE LINES 36-58

.. code-block:: Python

    import matplotlib.pyplot as plt
    import numpy as np

    fat_threshold_percent = 20

    # Boolean mask and string label for each sample.
    low_fat = fat < fat_threshold_percent
    labels = np.full(fd.n_samples, "high fat")
    labels[low_fat] = "low fat"
    colors = {
        "high fat": "red",
        "low fat": "blue",
    }

    fd.plot(
        group=labels,
        group_colors=colors,
        linewidth=0.5,
        alpha=0.7,
        legend=True,
    )
    plt.show()


.. image-sg:: /auto_examples/exploratory/images/sphx_glr_plot_explore_001.png
   :alt: Spectrometric curves
   :srcset: /auto_examples/exploratory/images/sphx_glr_plot_explore_001.png
   :class: sphx-glr-single-img


.. GENERATED FROM PYTHON SOURCE LINES 59-60

The mean of each group is shown below.

.. GENERATED FROM PYTHON SOURCE LINES 60-76

.. code-block:: Python

    from skfda.exploratory.stats import mean

    mean_low = mean(fd[low_fat])
    mean_high = mean(fd[~low_fat])

    # Stack both means, high fat first, to match the group order used below.
    means = mean_high.concatenate(mean_low)

    means.dataset_name = f"{fd.dataset_name} - means"
    means.plot(
        group=["high fat", "low fat"],
        group_colors=colors,
        linewidth=0.5,
        legend=True,
    )
    plt.show()


.. image-sg:: /auto_examples/exploratory/images/sphx_glr_plot_explore_002.png
   :alt: Spectrometric curves - means
   :srcset: /auto_examples/exploratory/images/sphx_glr_plot_explore_002.png
   :class: sphx-glr-single-img


.. GENERATED FROM PYTHON SOURCE LINES 77-83

In this dataset, the vertical shift of the original trajectories is not very
informative for predicting the fat content, whereas the shape of each curve
is highly relevant. This can be seen by looking at the first and second
derivatives.

The first derivative is shown below:

.. GENERATED FROM PYTHON SOURCE LINES 83-95

.. code-block:: Python

    fdd = fd.derivative()
    fdd.dataset_name = f"{fd.dataset_name} - derivative"
    fdd.plot(
        group=labels,
        group_colors=colors,
        linewidth=0.5,
        alpha=0.7,
        legend=True,
    )
    plt.show()


.. image-sg:: /auto_examples/exploratory/images/sphx_glr_plot_explore_003.png
   :alt: Spectrometric curves - derivative
   :srcset: /auto_examples/exploratory/images/sphx_glr_plot_explore_003.png
   :class: sphx-glr-single-img


.. GENERATED FROM PYTHON SOURCE LINES 96-97

We now show the second derivative:

.. GENERATED FROM PYTHON SOURCE LINES 97-107

.. code-block:: Python

    fdd = fd.derivative(order=2)
    fdd.dataset_name = f"{fd.dataset_name} - second derivative"
    fdd.plot(
        group=labels,
        group_colors=colors,
        linewidth=0.5,
        alpha=0.7,
        legend=True,
    )
    plt.show()


.. image-sg:: /auto_examples/exploratory/images/sphx_glr_plot_explore_004.png
   :alt: Spectrometric curves - second derivative
   :srcset: /auto_examples/exploratory/images/sphx_glr_plot_explore_004.png
   :class: sphx-glr-single-img


.. rst-class:: sphx-glr-timing

   **Total running time of the script:** (0 minutes 0.446 seconds)


.. _sphx_glr_download_auto_examples_exploratory_plot_explore.py:

.. only:: html

  .. container:: sphx-glr-footer sphx-glr-footer-example

    .. container:: binder-badge

      .. image:: images/binder_badge_logo.svg
        :target: https://mybinder.org/v2/gh/GAA-UAM/scikit-fda/develop?filepath=examples/exploratory/plot_explore.py
        :alt: Launch binder
        :width: 150 px

    .. container:: sphx-glr-download sphx-glr-download-jupyter

      :download:`Download Jupyter notebook: plot_explore.ipynb `

    .. container:: sphx-glr-download sphx-glr-download-python

      :download:`Download Python source code: plot_explore.py `

    .. container:: sphx-glr-download sphx-glr-download-zip

      :download:`Download zipped: plot_explore.zip `


.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery `_