Optimization and Horizontal Aggregation in SQL for Streamlined Data Mining

V. Anitha; Neeraj Sharma; Dr Vijay Kumar Burugari; Mrs Prathibha Ganapuram

Published: Dec 2, 2023

Keywords:

SQL code generation, initial data analysis, characteristics of data sample, main data analysis, properties

V. Anitha

Assistant Professor, Department of CSE, B V Raju Institute of Technology, Narsapur, Telangana, India-502313

Neeraj Sharma

Professor, Information Technology, Vasant Dada Patil Pratishthan College of Engineering and Visual Arts (VPPCOE&VA), Sion, Mumbai-22, Maharashtra

Dr Vijay Kumar Burugari

Associate Professor, Department of Computer Science and Engineering, Koneru Lakshmaiah Education Foundation, Vaddeswaram

Mrs Prathibha Ganapuram

Assistant Professor, Department of CSE, MallaReddy Engineering College (Autonomous), Maisammaguda, Hyderabad, Telangana, India-500100

Abstract

Preparing a dataset for analysis is a pivotal yet frequently time-consuming stage of a data mining project. It involves writing complex SQL queries that join tables and aggregate columns. Conventional SQL aggregations are limited for this purpose because they return a single aggregated value per group, with one row for each group. Consequently, manually constructing datasets with the required horizontal layout takes significant effort. To address this challenge, we present a collection of simple and effective methods that automate the generation of SQL code. These methods return aggregated columns in a horizontal tabular format, producing a set of numbers rather than a single number per row. Referred to as "horizontal aggregations," these functions build datasets in a de-normalized, horizontal layout (for example point-dimension, observation-variable, or instance-feature), which is the standard format expected by most data mining algorithms. We propose three methods for evaluating horizontal aggregations: CASE, which uses the programming CASE construct; SPJ, which is based on standard relational algebra operators (SPJ queries); and PIVOT, which uses the PIVOT operator available in some database management systems.
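As an illustration of the CASE method named in the abstract, the following is a minimal sketch of generating a horizontal aggregation query automatically. It is not the authors' code: the helper `horizontal_case_sql`, the `sales` table, and its columns are hypothetical names chosen for the example, and SQLite (via Python's standard `sqlite3` module) stands in for whatever DBMS is actually used.

```python
import sqlite3


def horizontal_case_sql(table, group_col, pivot_col, value_col, pivot_values, agg="SUM"):
    """Build a CASE-based horizontal aggregation query (hypothetical helper).

    One output column is generated per distinct value of pivot_col.
    Names and values are interpolated directly into the SQL text, so this
    sketch is only suitable for trusted, identifier-safe inputs.
    """
    cols = ", ".join(
        f"{agg}(CASE WHEN {pivot_col} = '{v}' THEN {value_col} END) AS {value_col}_{v}"
        for v in pivot_values
    )
    return (
        f"SELECT {group_col}, {cols} "
        f"FROM {table} GROUP BY {group_col} ORDER BY {group_col}"
    )


# Toy data (assumed for illustration only).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (store TEXT, quarter TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [("A", "Q1", 10.0), ("A", "Q1", 5.0), ("A", "Q2", 7.0), ("B", "Q1", 3.0)],
)

# Discover the distinct pivot values, then generate and run the query.
quarters = [r[0] for r in conn.execute("SELECT DISTINCT quarter FROM sales ORDER BY quarter")]
query = horizontal_case_sql("sales", "store", "quarter", "amount", quarters)
rows = conn.execute(query).fetchall()

print(query)
for row in rows:
    print(row)  # one row per store, one column per quarter
```

Each store ends up on a single row with one aggregated column per quarter; a store with no rows for a given quarter gets NULL in that column, because the CASE expression yields no non-NULL values for SUM to add up.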

Issue

Vol. 22 No. 01 (2023)

Section

Articles

This work is licensed under a Creative Commons Attribution 4.0 International License.
