Data sgp is the process of collecting, storing and managing information about the performance of a business. This information is used to assess the organization’s current and future performance, and is often compared with benchmarks to identify areas for improvement. It is important for businesses to have a solid plan in place to ensure that they are using their data effectively. This includes policies that deal with data governance, which encompasses everything from collection and storage to usage and oversight. Data governance is also important for making sure that the right data is being leveraged at the right time.
The SGP package contains classes, functions and data for calculating student growth percentiles and percentile growth projections/trajectories from large-scale, longitudinal education assessment data. It uses quantile regression to estimate coefficient matrices that describe the distribution of students' current scores conditional on their prior scores. A student growth percentile then reports where a student's current score falls relative to students with similar score histories. The SGP package is available on CRAN for use with R, the open source software environment for statistical computing. Running SGP analyses requires some familiarity with R.
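To make the idea of a growth percentile concrete, here is a minimal Python sketch. It is not the SGP package's method: the package fits quantile regressions, while this toy version simply ranks one student's current score against a hypothetical group of peers assumed to share the same prior score. The function name, the peer scores and the clamping to the range 1 to 99 are all assumptions made for illustration.

```python
def toy_growth_percentile(student_score, peer_scores):
    """Percentage of peers scoring strictly below the student, clamped to 1..99.

    Illustrative only: peers are assumed to have the same prior score as the
    student, which stands in for the conditioning done by quantile regression.
    """
    if not peer_scores:
        raise ValueError("peer_scores must not be empty")
    below = sum(1 for score in peer_scores if score < student_score)
    percentile = round(100 * below / len(peer_scores))
    return max(1, min(99, percentile))


# Hypothetical current-year scores of ten students with the same prior score.
peers = [500, 510, 515, 520, 530, 540, 545, 550, 560, 575]
print(toy_growth_percentile(545, peers))
```

A real analysis conditions on several prior scores at once and smooths across the whole score range, which is what the quantile regression models provide.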
To perform SGP analyses, teachers must have access to the mSGP (median student growth percentile) data for their students. These data are compiled annually from multiple assessments in different content areas. For example, a teacher’s 2024 mSGP score combines scores for reading, math and science from each of the three most recent years of assessments. These scores are then weighted to produce a single yearly average score that is used for evaluation.
SGP analyses require a dataset in either WIDE or LONG format, depending on the analysis being performed. In WIDE format each student occupies one row, with a separate column for each year's score; in LONG format each row holds one student's result for one assessment in one year. The lower level SGP functions, studentGrowthPercentiles and studentGrowthProjections, require the WIDE format, while higher level wrapper functions, such as summarizeSGP, require the LONG format. The LONG format is recommended for all but the simplest analyses because it is easier to prepare, store and update as new years of data arrive.
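The difference between the two layouts can be sketched in a few lines of Python. This is an illustration of the reshaping idea only, not code from the SGP package; the sample records, the WIDE column names (GRADE_2023, SS_2023 and so on) and the helper wide_to_long are hypothetical.

```python
# Hypothetical WIDE records: one row per student, one pair of columns per year.
wide_rows = [
    {"ID": "1001", "GRADE_2023": 4, "SS_2023": 512, "GRADE_2024": 5, "SS_2024": 530},
    {"ID": "1002", "GRADE_2023": 4, "SS_2023": 488, "GRADE_2024": 5, "SS_2024": None},
]


def wide_to_long(rows, years, content_area):
    """Return one record per student per year, skipping years with no score."""
    long_rows = []
    for row in rows:
        for year in years:
            score = row.get(f"SS_{year}")
            if score is None:
                continue  # no assessment result for this student in this year
            long_rows.append({
                "ID": row["ID"],
                "CONTENT_AREA": content_area,
                "YEAR": year,
                "GRADE": row.get(f"GRADE_{year}"),
                "SCALE_SCORE": score,
            })
    return long_rows


long_rows = wide_to_long(wide_rows, years=[2023, 2024], content_area="MATHEMATICS")
for record in long_rows:
    print(record)
```

Adding a new year to the LONG layout means appending rows, whereas the WIDE layout needs new columns for every student, which is one reason the LONG layout is easier to maintain.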
Complete and accurate data are essential for good results, especially in a longitudinal study. Incomplete or inaccurate data can bias the estimates of model parameters and lead to invalid conclusions about how different variables affect an outcome. Errors can also arise from incorrect assumptions about the underlying distribution or the noise variance in the data. Fortunately, many of these errors can be avoided by careful data preparation. In fact, almost all errors encountered in SGP analyses trace back to problems with data preparation. Taking the time to prepare data correctly is the first step toward reliable, operational SGP analyses.
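As a rough illustration of the kind of checks that data preparation involves, the Python sketch below scans LONG-format records for two common problems: duplicate rows for the same student, content area and year, and missing values. The field names follow the layout used in this article's description of the LONG format, and the function find_data_problems and the sample records are hypothetical; a real preparation step would apply whatever rules the assessment program requires.

```python
from collections import Counter

REQUIRED_FIELDS = ("ID", "CONTENT_AREA", "YEAR", "GRADE", "SCALE_SCORE")


def find_data_problems(long_rows):
    """Return a list of human-readable problems found in LONG-format records."""
    problems = []

    # A student should have at most one record per content area per year.
    key_counts = Counter(
        (r.get("ID"), r.get("CONTENT_AREA"), r.get("YEAR")) for r in long_rows
    )
    for key, count in key_counts.items():
        if count > 1:
            problems.append(f"duplicate record for {key} ({count} rows)")

    # Every record should carry a value for each required field.
    for index, record in enumerate(long_rows):
        for field in REQUIRED_FIELDS:
            if record.get(field) is None:
                problems.append(f"row {index}: missing {field}")

    return problems


# Hypothetical records containing one duplicate and one missing score.
rows = [
    {"ID": "1001", "CONTENT_AREA": "MATHEMATICS", "YEAR": 2024, "GRADE": 5, "SCALE_SCORE": 530},
    {"ID": "1001", "CONTENT_AREA": "MATHEMATICS", "YEAR": 2024, "GRADE": 5, "SCALE_SCORE": 535},
    {"ID": "1002", "CONTENT_AREA": "MATHEMATICS", "YEAR": 2024, "GRADE": 5, "SCALE_SCORE": None},
]
problems = find_data_problems(rows)
for problem in problems:
    print(problem)
```

Running checks like these before any analysis makes problems visible while they are still cheap to fix.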