Posts

Showing posts from July, 2020

Dimensionality and High Dimensional data in Machine Learning

Dimensionality

In machine learning and data science, dimensionality refers to the number of attributes (features) a dataset has. For example, a telecommunications dataset may have a large number of attributes (region, tenure, age, address, etc.). When such a dataset is stored in a CSV file, each column holds one attribute and therefore represents one dimension. This use of the word differs from how "dimension" is used in other areas of mathematics and science.

High Dimensional Data

High-dimensional data is a dataset in which the number of features exceeds the number of observations. Such a dataset has a very large number of attributes relative to its number of rows, which makes computation more complex. For example, suppose we have n observations (data points) and p features (attributes). If n = 1000 and p = 2000, the data is high dimensional because p > n. In simple words, no matter how big or small the dataset is, if the number of observatio
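The definition above (more features than observations) can be checked in a few lines of Python. This is only a minimal sketch: the helper names `shape_of` and `is_high_dimensional` are made up for illustration, and the table is assumed to be a plain list of equal-length rows rather than any particular library's data structure.

```python
def shape_of(rows):
    """Return (n, p): the number of observations and the number of features.

    Assumes `rows` is a list of equal-length rows, one row per observation.
    """
    n = len(rows)
    p = len(rows[0]) if rows else 0
    return n, p


def is_high_dimensional(n, p):
    """Apply the definition from the text: the data is high dimensional when p > n."""
    return p > n


# The example from the text: n = 1000 observations, p = 2000 features.
print(is_high_dimensional(1000, 2000))

# A tiny table with 2 observations and 3 features.
tiny = [[1, 2, 3], [4, 5, 6]]
n, p = shape_of(tiny)
print(n, p, is_high_dimensional(n, p))
```

Note that the check depends only on the ratio of columns to rows, not on the absolute size of the dataset, so even the tiny 2 x 3 table above counts as high dimensional under this definition.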