Function Resource

ExampleDataset

Retrieve example data as a dataset

ResourceFunction["ExampleDataset"][arg]

returns the ExampleData collection specified by arg as a dataset.

Details and Options

ResourceFunction["ExampleDataset"] takes one argument, which should be a valid first argument for ExampleData. Only data collections in the "MachineLearning" and "Statistics" domains are supported.
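
Candidate arguments can be listed with ExampleData itself. The following is a minimal sketch, assuming that ExampleData applied to a domain name returns a list of {"domain", "name"} specifications, as the examples on this page use it:

```wolfram
(* List a few collection specifications from each supported domain. *)
(* Each element is assumed to have the form {"domain", "name"}.    *)
Take[ExampleData["Statistics"], UpTo[3]]
Take[ExampleData["MachineLearning"], UpTo[3]]
```

Not every listed specification is guaranteed to convert; see Possible Issues.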

Examples

Basic Examples (1) 

Get an example dataset:

In[1]:=
ResourceFunction[
 "https://www.wolframcloud.com/obj/antononcube/DeployedResources/\
Function/ExampleDataset"][{"Statistics", "AnimalWeights"}]
Out[1]=
In[2]:=
ResourceFunction[
 "https://www.wolframcloud.com/obj/antononcube/DeployedResources/\
Function/ExampleDataset"][{"MachineLearning", "WineQuality"}]
Out[2]=

Applications (2) 

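Find Boston census tracts in which more than 98% of owner-occupied homes were built before 1940 (the AGE column holds this percentage, not an age in years):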

In[3]:=
ResourceFunction[
  "https://www.wolframcloud.com/obj/antononcube/DeployedResources/\
Function/ExampleDataset"][{"Statistics", "BostonHomes"}][
 Select[#AGE > 98 &]]
Out[3]=

Cross tabulate odor and edibility for mushrooms (we can see that odor is a good indicator of edibility):

In[4]:=
ResourceFunction["CrossTabulate"][
 ResourceFunction[
   "https://www.wolframcloud.com/obj/antononcube/DeployedResources/\
Function/ExampleDataset"][{"MachineLearning", "Mushroom"}][
  All, {#odor, Last[#]} &]]
Out[4]=

Possible Issues (3) 

If an unknown dataset name is specified, the result is Failure:

In[5]:=
ResourceFunction[
 "https://www.wolframcloud.com/obj/antononcube/DeployedResources/\
Function/ExampleDataset"][{"Statistics", "BlahBlah"}]
Out[5]=
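
Because failures are returned rather than thrown, calling code may want to test the result before using it. A minimal sketch, assuming the returned value is a Failure object (or $Failed) that FailureQ recognizes, and that the short name "ExampleDataset" resolves to the same function as the cloud URL used above:

```wolfram
(* Request a collection that does not exist *)
res = ResourceFunction["ExampleDataset"][{"Statistics", "BlahBlah"}];

(* Fall back to a Missing value when the conversion failed *)
If[FailureQ[res], Missing["NotAvailable"], res]
```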

ExampleDataset expects "Statistics" datasets to have one of the data types {"MultivariateSample", "TimeSeries", "EventData"}. The "Statistics" collections in ExampleData have the following data types:

In[6]:=
Union[ExampleData[#, "DataType"] & /@ ExampleData["Statistics"]]
Out[6]=

Failure is returned for "Statistics" datasets whose data type is not one of {"MultivariateSample", "TimeSeries", "EventData"}. Here is an example:

In[7]:=
ResourceFunction[
 "https://www.wolframcloud.com/obj/antononcube/DeployedResources/\
Function/ExampleDataset"][{"Statistics", "ScientificDiscoveries"}]
Out[7]=

Here is a summary of the successes and failures for the different data types in the "Statistics" example data collection:

In[8]:=
Quiet[ResourceFunction["RecordsSummary"]@
  Map[ExampleData[#, "DataType"] -> Head[ResourceFunction[
       "https://www.wolframcloud.com/obj/antononcube/\
DeployedResources/Function/ExampleDataset"][#]] &, ExampleData["Statistics"]]]
Out[8]=

Some "MachineLearning" example data have data shapes that do not match their variable names. In those cases ExampleDataset returns Failure:

In[9]:=
ResourceFunction[
 "https://www.wolframcloud.com/obj/antononcube/DeployedResources/\
Function/ExampleDataset"][{"MachineLearning", "BostonHomes"}]
Out[9]=

Compare the number of variable names:

In[10]:=
Length[Flatten@
  Apply[List, ExampleData[{"MachineLearning", "BostonHomes"}, "VariableDescriptions"]]]
Out[10]=

with the dimensions of the data:

In[11]:=
Dimensions[
 Map[Flatten, List @@@ ExampleData[{"MachineLearning", "BostonHomes"}, "Data"]]]
Out[11]=
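
The two measurements can be combined into a single check. The sketch below uses a hypothetical helper, namesMatchDataQ, which is not part of ExampleDataset; it assumes that "VariableDescriptions" and "Data" have the input -> output rule structure seen in the two cells above and that the first row is representative of all rows:

```wolfram
(* Hypothetical helper: True when the number of variable names *)
(* equals the number of values in the first flattened data row. *)
namesMatchDataQ[spec_] := Module[{names, rows},
  names = Flatten[List @@ ExampleData[spec, "VariableDescriptions"]];
  rows = Flatten /@ (List @@@ ExampleData[spec, "Data"]);
  Length[names] == Length[First[rows]]
 ]

namesMatchDataQ[{"MachineLearning", "BostonHomes"}]
```

A False result here would be consistent with the Failure shown above, although ExampleDataset may apply its own checks.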

Here is an association that shows the successes and failures over the "MachineLearning" datasets:

In[12]:=
Association[# -> Head[ResourceFunction[
      "https://www.wolframcloud.com/obj/antononcube/DeployedResources/\
Function/ExampleDataset"][#]] & /@ ExampleData["MachineLearning"]]
Out[12]=

Neat Examples (1) 

Summaries for all "Statistics" datasets in ExampleData that have six columns:

In[13]:=
Block[{resAll},
 resAll = Quiet[Association@
    Map[# -> ResourceFunction[
        "https://www.wolframcloud.com/obj/antononcube/\
DeployedResources/Function/ExampleDataset"][#] &, ExampleData["Statistics"]]];
 ResourceFunction["RecordsSummary"] /@ Select[resAll, Head[#] === Dataset && Dimensions[#][[2]] == 6 &]
 ]
Out[13]=