load_wisconsin_breast_cancer_data#
- QuadratiK.datasets.load_wisconsin_breast_cancer_data(desc: bool = False, return_X_y: bool = False, as_dataframe: bool = True, scaled: bool = False) → tuple[str, DataFrame, DataFrame] | tuple[str, DataFrame] | tuple[str, ndarray] | tuple[DataFrame, DataFrame] | tuple[ndarray, ndarray] | DataFrame | ndarray#
The Wisconsin breast cancer dataset has 569 rows and 31 columns. The first 30 columns are features computed from a digitized image of a fine needle aspirate (FNA) of a breast mass; they describe characteristics of the cell nuclei present in the image. The last column holds the class label (Benign = 0, Malignant = 1).
The function load_wisconsin_breast_cancer_data loads the Breast Cancer Wisconsin (Diagnostic) dataset.
Read more in the User Guide.
Parameters#
- desc : boolean, optional
If set to True, the function will return the description along with the data. If set to False, the description will not be included. Defaults to False.
- return_X_y : boolean, optional
Determines whether the function should return the features (X) and the class labels (y) as separate objects rather than as one combined dataset. Defaults to False.
- as_dataframe : boolean, optional
Determines whether the function should return the data as a pandas DataFrame (True) or as a numpy array (False). Defaults to True.
- scaled : boolean, optional
Determines whether or not the data should be scaled. If set to True, each row of the data will be divided by its Euclidean norm. Defaults to False.
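To make the row-wise scaling described for scaled concrete, here is a minimal sketch in plain Python. The helper scale_rows is hypothetical and is not part of QuadratiK; it only illustrates dividing each row by its Euclidean norm. Leaving all-zero rows unchanged is an assumption of this sketch, not documented library behavior.

```python
import math


def scale_rows(rows):
    """Divide each row by its Euclidean (L2) norm.

    Hypothetical helper for illustration only; not part of QuadratiK.
    """
    scaled = []
    for row in rows:
        norm = math.sqrt(sum(v * v for v in row))
        # Assumption of this sketch: all-zero rows are left unchanged
        # to avoid dividing by zero.
        scaled.append([v / norm for v in row] if norm > 0 else list(row))
    return scaled


rows = [[3.0, 4.0], [1.0, 0.0]]
for row in scale_rows(rows):
    # After scaling, every nonzero row has unit Euclidean norm.
    print(row, math.sqrt(sum(v * v for v in row)))
```

With numpy arrays, the same operation is commonly written as X / np.linalg.norm(X, axis=1, keepdims=True).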
Returns#
- If desc=True, return_X_y=True, as_dataframe=True:
Returns a tuple containing: (str, pd.DataFrame, pd.DataFrame)
- fdescr : str
The description of the dataset.
- X : pd.DataFrame
A DataFrame with the features.
- y : pd.DataFrame
A DataFrame with the class labels.
- If desc=True, return_X_y=True, as_dataframe=False:
Returns a tuple containing: (str, np.ndarray, np.ndarray)
- fdescr : str
The description of the dataset.
- X : np.ndarray
A numpy array with the features.
- y : np.ndarray
A numpy array with the class labels.
- If desc=True, return_X_y=False, as_dataframe=True:
Returns a tuple containing: (str, pd.DataFrame)
- fdescr : str
The description of the dataset.
- data_df : pd.DataFrame
A DataFrame containing the entire dataset.
- If desc=True, return_X_y=False, as_dataframe=False:
Returns a tuple containing: (str, np.ndarray)
- fdescr : str
The description of the dataset.
- data : np.ndarray
A numpy array containing the entire dataset.
- If desc=False, return_X_y=True, as_dataframe=True:
Returns a tuple containing: (pd.DataFrame, pd.DataFrame)
- X : pd.DataFrame
A DataFrame with the features.
- y : pd.DataFrame
A DataFrame with the class labels.
- If desc=False, return_X_y=True, as_dataframe=False:
Returns a tuple containing: (np.ndarray, np.ndarray)
- X : np.ndarray
A numpy array with the features.
- y : np.ndarray
A numpy array with the class labels.
- If desc=False, return_X_y=False, as_dataframe=True:
Returns: pd.DataFrame
- data_df : pd.DataFrame
A DataFrame containing the entire dataset.
- If desc=False, return_X_y=False, as_dataframe=False:
Returns: np.ndarray
- data : np.ndarray
A numpy array containing the entire dataset.
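The eight cases above follow one pattern: as_dataframe picks the container type, return_X_y decides between one combined object and an (X, y) pair, and desc prepends the description string. The sketch below encodes that pattern in a hypothetical helper, expected_return_types, which is not part of QuadratiK and only summarizes the documented return types as strings.

```python
def expected_return_types(desc=False, return_X_y=False, as_dataframe=True):
    """Return the documented return type names for a flag combination.

    Hypothetical helper for illustration only; not part of QuadratiK.
    """
    container = "pd.DataFrame" if as_dataframe else "np.ndarray"
    # return_X_y=True yields two objects (X, y); otherwise one combined object.
    parts = [container, container] if return_X_y else [container]
    # desc=True prepends the dataset description string.
    if desc:
        parts.insert(0, "str")
    # A single object is returned bare rather than wrapped in a tuple.
    return parts[0] if len(parts) == 1 else tuple(parts)


# Enumerate all eight documented combinations.
for desc in (True, False):
    for return_X_y in (True, False):
        for as_dataframe in (True, False):
            print(desc, return_X_y, as_dataframe,
                  expected_return_types(desc, return_X_y, as_dataframe))
```

Because the number of returned objects changes with the flags, callers that unpack the result need to match the flags they pass.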
References#
Street, W. N., Wolberg, W. H., & Mangasarian, O. L. (1993, July). Nuclear feature extraction for breast tumor diagnosis. In Biomedical image processing and biomedical visualization (Vol. 1905, pp. 861-870). SPIE.
Source#
Wolberg, W., Mangasarian, O., Street, N., & Street, W. (1993). Breast Cancer Wisconsin (Diagnostic) [Dataset]. UCI Machine Learning Repository. https://doi.org/10.24432/C5DW2B.
Examples#
from QuadratiK.datasets import load_wisconsin_breast_cancer_data
X, y = load_wisconsin_breast_cancer_data(return_X_y=True)
print(X.head())
   radius1  texture1  perimeter1   area1  smoothness1  compactness1  \
0    17.99     10.38      122.80  1001.0      0.11840       0.27760
1    20.57     17.77      132.90  1326.0      0.08474       0.07864
2    19.69     21.25      130.00  1203.0      0.10960       0.15990
3    11.42     20.38       77.58   386.1      0.14250       0.28390
4    20.29     14.34      135.10  1297.0      0.10030       0.13280

   concavity1  concave_points1  symmetry1  fractal_dimension1  ...  radius3  \
0      0.3001          0.14710     0.2419             0.07871  ...    25.38
1      0.0869          0.07017     0.1812             0.05667  ...    24.99
2      0.1974          0.12790     0.2069             0.05999  ...    23.57
3      0.2414          0.10520     0.2597             0.09744  ...    14.91
4      0.1980          0.10430     0.1809             0.05883  ...    22.54

   texture3  perimeter3   area3  smoothness3  compactness3  concavity3  \
0     17.33      184.60  2019.0       0.1622        0.6656      0.7119
1     23.41      158.80  1956.0       0.1238        0.1866      0.2416
2     25.53      152.50  1709.0       0.1444        0.4245      0.4504
3     26.50       98.87   567.7       0.2098        0.8663      0.6869
4     16.67      152.20  1575.0       0.1374        0.2050      0.4000

   concave_points3  symmetry3  fractal_dimension3
0           0.2654     0.4601             0.11890
1           0.1860     0.2750             0.08902
2           0.2430     0.3613             0.08758
3           0.2575     0.6638             0.17300
4           0.1625     0.2364             0.07678

[5 rows x 30 columns]