Redundancy Operations in Datasets using Graphs and Shapley Values

27th May, 2025 | Research blog

Author

Guru Ganesan

Guru’s latest Hub research concentrates on redundancy in datasets. This blog is an extract from a recent paper that describes redundancy in datasets as an important object of study from both theoretical and application perspectives.

Depending on the application, redundancy may be either useful or wasteful in terms of performance. Index redundancy has been studied mainly in the context of the class imbalance problem, where the minority class event occurs rarely and, as a result, little data is available to confirm its occurrence. This in turn hinders the accuracy of predictive models.

Read the full blog here

Contact Guru Ganesan