load_wine_data
- QuadratiK.datasets.load_wine_data(desc: bool = False, return_X_y: bool = False, as_dataframe: bool = True, scaled: bool = False) → tuple[str, DataFrame, DataFrame] | tuple[str, DataFrame] | tuple[str, ndarray] | tuple[DataFrame, DataFrame] | tuple[ndarray, ndarray] | DataFrame | ndarray
The wine dataset has 178 rows and 14 columns. The first 13 columns report 13 constituents found in each of three types of wine. The last column holds the class label (1, 2, or 3).
The function load_wine_data loads the Wine dataset.
Read more in the User Guide.
Parameters
- desc : boolean, optional
If set to True, the function returns the dataset description along with the data. If set to False, the description is not included. Defaults to False.
- return_X_y : boolean, optional
Determines whether the function returns the features and the class labels as separate objects (X and y) instead of a single dataset. Defaults to False.
- as_dataframe : boolean, optional
Determines whether the function returns the data as a pandas DataFrame (True) or as a numpy array (False). Defaults to True.
- scaled : boolean, optional
Determines whether the data are scaled. If set to True, each row is divided by its Euclidean norm. Defaults to False.
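The row-wise scaling described for scaled=True can be illustrated with a minimal NumPy sketch. The helper name scale_rows_to_unit_norm and the toy array are assumptions made for illustration only; they are not part of QuadratiK, and the library's own implementation may differ (for example, in how it treats the class label column).

```python
import numpy as np


def scale_rows_to_unit_norm(X):
    """Divide each row of X by its Euclidean (L2) norm.

    Hypothetical helper for illustration; not a QuadratiK function.
    Rows that are entirely zero would cause a division by zero and
    are not handled here.
    """
    X = np.asarray(X, dtype=float)
    # keepdims=True keeps the norms as a column so broadcasting divides row by row.
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    return X / norms


# Made-up data: two observations with two features each.
toy = np.array([[3.0, 4.0], [6.0, 8.0]])
scaled = scale_rows_to_unit_norm(toy)
print(scaled)
print(np.linalg.norm(scaled, axis=1))  # every row now has unit norm
```

After this kind of scaling every observation lies on the unit sphere, so only the direction of each feature vector is retained, not its magnitude.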
Returns
- If desc=True, return_X_y=True, as_dataframe=True:
Returns a tuple containing: (str, pd.DataFrame, pd.DataFrame)
- fdescr : str
The description of the dataset.
- X : pd.DataFrame
A DataFrame with the features.
- y : pd.DataFrame
A DataFrame with the class labels.
- If desc=True, return_X_y=True, as_dataframe=False:
Returns a tuple containing: (str, np.ndarray, np.ndarray)
- fdescr : str
The description of the dataset.
- X : np.ndarray
A numpy array with the features.
- y : np.ndarray
A numpy array with the class labels.
- If desc=True, return_X_y=False, as_dataframe=True:
Returns a tuple containing: (str, pd.DataFrame)
- fdescr : str
The description of the dataset.
- data_df : pd.DataFrame
A DataFrame containing the entire dataset.
- If desc=True, return_X_y=False, as_dataframe=False:
Returns a tuple containing: (str, np.ndarray)
- fdescr : str
The description of the dataset.
- data : np.ndarray
A numpy array containing the entire dataset.
- If desc=False, return_X_y=True, as_dataframe=True:
Returns a tuple containing: (pd.DataFrame, pd.DataFrame)
- X : pd.DataFrame
A DataFrame with the features.
- y : pd.DataFrame
A DataFrame with the class labels.
- If desc=False, return_X_y=True, as_dataframe=False:
Returns a tuple containing: (np.ndarray, np.ndarray)
- X : np.ndarray
A numpy array with the features.
- y : np.ndarray
A numpy array with the class labels.
- If desc=False, return_X_y=False, as_dataframe=True:
Returns: pd.DataFrame
- data_df : pd.DataFrame
A DataFrame containing the entire dataset.
- If desc=False, return_X_y=False, as_dataframe=False:
Returns: np.ndarray
- data : np.ndarray
A numpy array containing the entire dataset.
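When the entire dataset is returned as a single array (return_X_y=False, as_dataframe=False), the description above places the class labels in the last of the 14 columns. The following is a minimal sketch of separating features from labels by hand. It uses a small made-up array in place of the real data, and the helper name split_features_labels is a hypothetical one, not a QuadratiK function. The shape of y returned by load_wine_data itself may differ from the 1-D array produced here.

```python
import numpy as np


def split_features_labels(data):
    """Split a 2-D array into features (all but the last column) and labels (last column).

    Hypothetical helper for illustration; not part of QuadratiK.
    """
    data = np.asarray(data)
    X = data[:, :-1]  # every column except the last
    y = data[:, -1]   # the last column, as a 1-D array
    return X, y


# Made-up stand-in with 3 rows and 14 columns (13 features + 1 class label).
features = np.arange(39, dtype=float).reshape(3, 13)
labels = np.array([[1.0], [2.0], [3.0]])
toy = np.hstack([features, labels])

X, y = split_features_labels(toy)
print(X.shape, y.shape)
```

Passing return_X_y=True avoids this manual step, since the function then returns X and y separately.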
References
Aeberhard, S., Coomans, D., & De Vel, O. (1994). Comparative analysis of statistical pattern recognition methods in high dimensional settings. Pattern Recognition, 27(8), 1065-1077.
Source
Aeberhard, S. & Forina, M. (1992). Wine [Dataset]. UCI Machine Learning Repository. https://doi.org/10.24432/C5PC7J.
Examples
from QuadratiK.datasets import load_wine_data
X, y = load_wine_data(return_X_y=True)
print(X.head())
   Alcohol  Malicacid   Ash  Alcalinity_of_ash  Magnesium  Total_phenols  \
0    14.23       1.71  2.43               15.6      127.0           2.80
1    13.20       1.78  2.14               11.2      100.0           2.65
2    13.16       2.36  2.67               18.6      101.0           2.80
3    14.37       1.95  2.50               16.8      113.0           3.85
4    13.24       2.59  2.87               21.0      118.0           2.80

   Flavanoids  Nonflavanoid_phenols  Proanthocyanins  Color_intensity   Hue  \
0        3.06                  0.28             2.29             5.64  1.04
1        2.76                  0.26             1.28             4.38  1.05
2        3.24                  0.30             2.81             5.68  1.03
3        3.49                  0.24             2.18             7.80  0.86
4        2.69                  0.39             1.82             4.32  1.04

   0D280_0D315_of_diluted_wines  Proline
0                          3.92   1065.0
1                          3.40   1050.0
2                          3.17   1185.0
3                          3.45   1480.0
4                          2.93    735.0