An Investigation into the Estimation of a Positive Case of COVID-19: A Comparative Study between Two Phases of the Pandemic

In Japan, the policy for polymerase chain reaction (hereafter PCR) testing changed significantly on 7 May 2020. From 4 February to 6 May, PCR testing was limited to certain patients with severe symptoms; from 7 May, the test was made available to a broader range of patients because it became covered by health insurance. This study tests whether there is a significant relationship between the conditions under which PCR tests are performed, the number of tests after 7 May, and the number of positive results. Using a multiple regression model, we obtained an unexpected result: even if PCR testing had been carried out from 4 February to 6 May at the same level as after 7 May, the estimated number of positive cases would have been lower than the number actually recorded. This suggests that even if PCR testing had been plentiful throughout the entire period, the number of positives captured would not necessarily have been greater than the actual number. This estimate may indicate that the infectivity and spreading power of COVID-19 changed over time. Further research into the epidemic impact of COVID-19 is therefore required and is critical for humankind.

The authors are specialists in the social sciences and the field of social statistics. This research aims to contribute to the commentary on the current disruptive environment in the public health domain by providing society with information about who has been and who will be dealing with the impact of this epidemic, detailing a new landscape based on models created using estimated data.
The data used for the study are provided by the Japanese Ministry of Health, Labour and Welfare as open data published online (Ministry of Health, Labor and Welfare, 2020a). According to these data, the number of PCR tests conducted in Japan differs noticeably between two periods: the outbreak (until 6 May 2020) and the off-peak (7 May to 16 July 2020). This is mainly because Japanese national health insurance began to cover PCR tests from 7 May onwards; since then, the number of tests has increased drastically throughout Japan. Therefore, this paper defines the first outbreak period, from 4 February to 6 May, as the 'first half of the epidemic' and the period from 7 May to 16 July as the 'second half of the epidemic'.
Our research is based on the notion that the number of PCR tests conducted should be the basis for estimating the number of cases in the first half of the epidemic. We have quantified and modelled the relationship between the number of PCR tests and the number of positive cases. In this approach, a theoretical number of PCR tests for the first half is estimated using a regression equation fitted to statistics from the second half. In addition, we attempt to provide a basis for determining whether the condition after July is more severe by comparing the estimated theoretical number of positive cases in the first half with the number of positive cases in the second half.

Operation of PCR Tests and Observation
The principle of a PCR test is to amplify a single molecule of DNA into millions of copies in a short amount of time (Kucirka et al., 2020). Three consecutive steps are required to achieve amplification.
Step 1, denaturation: a double-stranded DNA template is heated to separate the DNA strands.
Step 2, annealing: a short DNA molecule called a primer binds to the region adjacent to the target DNA.
Step 3, elongation: the polymerase synthesises the complementary strand of the template, starting from each primer.
This three-step 'cycle' is repeated 25-35 times to exponentially synthesise exact copies of the target DNA (Thermo Fisher Scientific, 2020; Mullis, 1987).
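The exponential growth implied by repeating the cycle can be illustrated with a short calculation. The sketch below assumes an idealised reaction in which every cycle multiplies the number of copies by (1 + efficiency); the `efficiency` parameter and the function name are illustrative assumptions and do not come from the cited sources.

```python
def copies_after_cycles(initial_copies, cycles, efficiency=1.0):
    """Idealised PCR yield.

    Each cycle multiplies the copy number by (1 + efficiency), so a
    perfect reaction (efficiency = 1.0) doubles the copies every cycle.
    The efficiency parameter is an illustrative assumption.
    """
    return initial_copies * (1.0 + efficiency) ** cycles


# Copies obtained from a single template molecule over the 25-35 cycle range.
for n_cycles in (25, 30, 35):
    print(n_cycles, f"{copies_after_cycles(1, n_cycles):.3e}")
```

Under perfect doubling, 25 cycles already yield about 33.5 million copies from one molecule, which is why a small amount of viral genetic material can become detectable.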
However, only about 70% of infected people test positive on the PCR test; as many as 30% of infected cases return a negative result. Sensitivity, specificity, and positive predictive value are indicators of test accuracy. Sensitivity is the percentage of people who have the disease and test positive, while specificity is the percentage of people who do not have the disease and test negative. The positive predictive value is the percentage of people testing positive who actually have the disease (The University of Tokyo Health Promotion Headquarters Health Center, 2020).
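The three accuracy indicators can be computed from the four cells of a 2x2 table of test result against true disease status. The following is a minimal sketch; the counts are hypothetical and chosen only so that sensitivity comes out at the roughly 70% mentioned above.

```python
def sensitivity(true_pos, false_neg):
    """Share of people with the disease who test positive."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """Share of people without the disease who test negative."""
    return true_neg / (true_neg + false_pos)


def positive_predictive_value(true_pos, false_pos):
    """Share of positive test results that come from people with the disease."""
    return true_pos / (true_pos + false_pos)


# Hypothetical counts for illustration only (not taken from the cited data).
tp, fn, tn, fp = 70, 30, 990, 10

print(f"sensitivity: {sensitivity(tp, fn):.3f}")
print(f"specificity: {specificity(tn, fp):.3f}")
print(f"PPV:         {positive_predictive_value(tp, fp):.3f}")
```

Unlike sensitivity and specificity, the positive predictive value depends on how common the disease is among those tested, so it changes when the testing criteria change.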

Statistical Discussions Based on Public Data
Mathematical models used in infectious disease analysis include the SEIR model (Hokkaido University School of Medical Statistics, 2020). The model takes its name from the initial letters of four categories of data: non-immune [Susceptible], infected and incubating [Exposed], affected [Infectious], and recovered [Recovered]. Furthermore, three additional parameters are required to measure the transition rates: the basic reproduction number [R0], the average incubation period, and the average infectious period. SEIR is an elaborate model that outsiders cannot retest because it relies on data available only to experts. These constraints mean that researchers cannot easily use the SEIR model.
In Figure 1, the grey line is the number of PCR tests on the left axis, and the black line is the daily record of the number of positive individuals on the right axis. In the first half of the epidemic, the conditions for undergoing the test were rigorous. The patient was required to exhibit a fever of 37.5 °C or higher for four days or more, together with cold symptoms, fatigue and dyspnea (Ministry of Health, Labor and Welfare, 2020b). Before 6 May, PCR tests were restricted to specific groups of patients; thus, the number of PCR tests conducted during this time is relatively low, 9,252 as of 13 April 2020 (Ministry of Health, Labor and Welfare, 2020a). This is lower than the number of tests conducted by other developed countries; according to an OECD report on diagnostic testing data (OECD, 2020), Japan is the second-lowest member state regarding the number of tests conducted (Figure 2). Once national health insurance began to cover PCR tests on 7 May, the number of tests conducted on that day jumped to 13,005; thereafter, the average number of tests conducted until 16 July remained around 5,000 per day.
The total number of PCR tests conducted between 16 January and 6 May, defined as the outbreak period, was 158,267, with 15,660 positives and a positive ratio of 9.89%. In the second half, the off-peak period from 7 May to 16 July, the total number of PCR tests conducted was 352,135, with 7,845 positives and a positive ratio of 2.23%. When the numbers of PCR tests in the first and second periods are subjected to a t-test, the result is significant (sig. <0.1%); this implies that there is a significant difference between the two periods.
There is also a significant difference in the number of positives (sig. <5%). The black bars in Figure 1 represent positive cases, and the grey bars represent the total number of PCR tests conducted on a single day. The period in which the number of infected people exceeded 500 every day, 10 April to 18 April, was when the spread of infection was most feared; after the second week of July, more than 300 positive cases were detected per day, which was again viewed as widespread infection. However, the conditions under which PCR tests were conducted in the first and second halves of the epidemic were not the same. In engineering, it is standard practice to compare things by making the environment the same or by simulating the same conditions. Therefore, we will try to simulate the number of PCR tests conducted in the first and second halves of the epidemic under the same conditions.
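The descriptive statistics in this section can be checked with a few lines of code. The sketch below recomputes the two positive ratios from the reported totals and shows how a two-sample t statistic could be formed from daily counts. The text does not state which variant of the t-test was used, so Welch's unequal-variance form is an assumption here, and the daily counts are hypothetical placeholders rather than the ministry's data.

```python
from math import sqrt
from statistics import mean, variance


def positive_ratio(positives, tests):
    """Share of conducted tests that returned a positive result."""
    return positives / tests


def welch_t(sample_a, sample_b):
    """Welch's t statistic for two independent samples (unequal variances)."""
    standard_error = sqrt(
        variance(sample_a) / len(sample_a) + variance(sample_b) / len(sample_b)
    )
    return (mean(sample_a) - mean(sample_b)) / standard_error


# Totals reported in the text for the two periods.
first_half_ratio = positive_ratio(15_660, 158_267)
second_half_ratio = positive_ratio(7_845, 352_135)
print(f"first half:  {first_half_ratio:.2%}")
print(f"second half: {second_half_ratio:.2%}")

# Hypothetical daily test counts, for illustration only.
first_half_tests = [900, 1500, 2100, 1800, 2400, 1200]
second_half_tests = [4800, 5200, 6100, 4500, 5600, 5000]
t_statistic = welch_t(second_half_tests, first_half_tests)
print(f"Welch t statistic (toy data): {t_statistic:.2f}")
```

A significance level would additionally require the t distribution with the Welch-Satterthwaite degrees of freedom, which a statistics package normally supplies.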

Results
a. Data and Analytical Approach
The open data provided by the Ministry of Health, Labour and Welfare (Ministry of Health, Labor and Welfare, 2020a) include the number of PCR tests performed per day, the number of positive PCR tests per day, the cumulative number of inpatients, the cumulative number of discharges/medical treatment cancellations, and the cumulative number of deaths. From these, the following five daily series were obtained: the number of PCR tests performed per day; the number of positive PCR tests per day; the number of inpatients per day; the number of discharges/medical treatment cancellations per day; and the number of deaths per day. The raw data used for this study are displayed in the Appendix.
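One plausible way to turn the cumulative series into per-day series is first-differencing. The text does not describe the exact procedure, so the sketch below is an assumption, and the sample values are hypothetical.

```python
def daily_from_cumulative(cumulative):
    """Convert a cumulative series into day-over-day increments.

    The first day has no predecessor, so its increment is taken to be the
    first cumulative value itself (an assumption of this sketch).
    """
    daily = []
    previous = 0
    for value in cumulative:
        daily.append(value - previous)
        previous = value
    return daily


# Hypothetical cumulative death counts over seven days.
cumulative_deaths = [0, 1, 1, 3, 6, 6, 10]
daily_deaths = daily_from_cumulative(cumulative_deaths)
print(daily_deaths)  # [0, 1, 0, 2, 3, 0, 4]
```

Summing the daily increments recovers the last cumulative value, which gives a quick consistency check on the derived series.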
Using these five data groups, we created a multiple regression model for the second half. The model was then used to estimate the number of PCR tests that could have been conducted in the first half and the hypothetical number of positive cases. IBM SPSS Version 26 was used for the analysis. The dependent variable was the actual number of PCR tests conducted in the second half of the period, and the remaining four series were candidate independent variables. The four variables were entered sequentially using the stepwise method in SPSS. Two variables, the number of discharges/cancellations and the number of deaths, were not significant in estimating the dependent variable and were excluded; the other two, the number of positive cases and the number of inpatients, were retained as significant variables for the estimation. The model has R = 0.648 and R² = 0.420, implying that it explains about 42.0% of the variance in the dependent variable. The Durbin-Watson value is 1.799, which is close to 2 and indicates little autocorrelation in the residuals. The beta values are sufficiently large and the significance probabilities are 5% or less, so we regard the developed model as reliable.
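The fitting step can be sketched without SPSS. The code below fits an ordinary least squares model with an intercept and two predictors (positive cases and inpatients) and then computes R² and the Durbin-Watson statistic. It does not reproduce the SPSS stepwise selection, the function names are illustrative, and the data are hypothetical placeholders rather than the ministry's figures.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_ols(rows, y):
    """Least squares coefficients [intercept, b1, b2, ...] via normal equations."""
    design = [[1.0, *r] for r in rows]
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def predict(coefs, rows):
    """Fitted values for each row of predictors."""
    return [coefs[0] + sum(c * v for c, v in zip(coefs[1:], r)) for r in rows]


def r_squared(y, fitted):
    """Share of the variance in y explained by the fitted values."""
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


def durbin_watson(y, fitted):
    """Durbin-Watson statistic; values near 2 suggest no first-order autocorrelation."""
    resid = [yi - fi for yi, fi in zip(y, fitted)]
    num = sum((resid[t] - resid[t - 1]) ** 2 for t in range(1, len(resid)))
    return num / sum(e ** 2 for e in resid)


# Hypothetical second-half days: (positive cases, inpatients) and tests conducted.
rows = [(10, 200), (20, 180), (15, 220), (30, 260), (25, 240), (40, 300)]
daily_tests = [1500, 1900, 1800, 2600, 2300, 3200]

coefs = fit_ols(rows, daily_tests)
fitted = predict(coefs, rows)
print("coefficients:", [round(c, 3) for c in coefs])
print("R squared:", round(r_squared(daily_tests, fitted), 3))
print("Durbin-Watson:", round(durbin_watson(daily_tests, fitted), 3))
```

R is the square root of R², so a reported R of 0.648 corresponds to an R² of about 0.420.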

b. Estimate of the Number of PCR Tests that might have been Carried Out
As given by Equation 1, the estimated number of standardised PCR tests per day is obtained from the number of positive PCR tests per day and the number of inpatients per day in the first half. The cumulative number of PCR tests measured in the first half was 156,387, while the estimated value was 510,695. In other words, it can be inferred that if the testing process had been the same in the first half as in the second half, the number of tests would have been about 3.27 times greater. Figure 3 shows both series: the estimated values as the black line and the actual values as the grey line.
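The estimation step amounts to applying the second-half regression equation to each first-half day and summing the results. The sketch below illustrates this under loud assumptions: the coefficients and the daily inputs are hypothetical placeholders, since the fitted values of Equation 1 are not reproduced here. Only the two cumulative totals are taken from the text.

```python
# Hypothetical coefficients standing in for Equation 1; the fitted values
# are not given in the text, so these numbers are placeholders only.
INTERCEPT = 1000.0
COEF_POSITIVES = 20.0
COEF_INPATIENTS = 0.5


def estimate_daily_tests(positives, inpatients):
    """Apply a second-half regression equation to one first-half day."""
    return INTERCEPT + COEF_POSITIVES * positives + COEF_INPATIENTS * inpatients


# Hypothetical first-half days: (positive cases, inpatients).
first_half_days = [(50, 400), (120, 900), (300, 2500)]
estimated_total = sum(estimate_daily_tests(p, h) for p, h in first_half_days)
print(f"illustrative estimated total: {estimated_total:.0f}")

# Cumulative figures reported in the text for the first half.
measured_tests = 156_387
estimated_tests = 510_695
ratio = estimated_tests / measured_tests
print(f"estimated / measured tests: {ratio:.2f}")
```

The ratio computed from the reported totals agrees with the multiple quoted above up to rounding.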

Figure 4. A Comparison of the Actual Number of Positive Cases and the Estimated Number of Positive Cases
Based on the estimated number of PCR tests that could have been carried out, the estimated number of positive cases before 6 May is shown in Figure 4. The estimated number of positive cases (black line) is lower than the actual number of positive cases (grey line).

Discussion
The estimated number of positive cases is 11,235, which is 4,156 fewer than the 15,391 actual positives in the first half of the epidemic. This may suggest that the properties of COVID-19 have changed, with the second half being more attenuated than the first half, or that the human antibody response has changed enormously, or both. Alternatively, it may mean that an unknown factor has been added.
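The headline comparison can be verified directly from the reported totals. The sketch below recomputes the shortfall and, as a derived quantity not stated in the text, the positive rate implied by each pair of totals. It assumes that the first-half test total of 156,387 and the 15,391 actual positives refer to the same period.

```python
# Figures reported in the text for the first half of the epidemic.
actual_positives = 15_391      # recorded positive cases
estimated_positives = 11_235   # model estimate of positive cases
measured_tests = 156_387       # recorded cumulative tests
estimated_tests = 510_695      # model estimate of cumulative tests

shortfall = actual_positives - estimated_positives

# Derived rates (an illustration; assumes both totals cover the same days).
actual_rate = actual_positives / measured_tests
implied_rate = estimated_positives / estimated_tests

print(f"shortfall: {shortfall}")
print(f"actual positive rate:  {actual_rate:.2%}")
print(f"implied positive rate: {implied_rate:.2%}")
```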
This study dealt with the number of PCR tests and the number of PCR-positive cases, but the relationship between the number of PCR-positive cases and deaths has not yet been studied. Regarding the cause of death, the Ministry of Health, Labor and Welfare issued a notice on June 8, 2020, stating that deaths of PCR-positive cases should be reported as due to COVID-19 without specifying the cause of death (Ministry of Health, Labor and Welfare, 2020c). In the case of influenza, however, statistics are kept by clearly separating direct and indirect deaths. Since the notification, statistics that distinguish between direct and indirect deaths have not been available for COVID-19 in Japan. This is expected to be an obstacle to future statistical analysis.

Conclusion
Although this study has provided a practical and feasible analytical model for further research, it remains at a pilot test level. For example, the relationships among the number of deaths, the number of PCR tests, the number of PCR positives and other factors have yet to be analysed. We acknowledge the potential of a future collaborative approach that includes specialists in infectious diseases and immunology, in addition to statisticians and engineers, who can develop robust predictive models to support public health decision making. The negative impact of COVID-19 and other viral diseases is not limited to the medical and health domains (van Eeden et al., 2020). It should be at the top of public health agendas, with particular attention paid to vulnerable citizens, including disabled people, infants and other younger children (Dijk, 2020). COVID-19 has caused extensive societal, economic, and psychological impacts on humans within a disrupted environment. Additionally, how best to support stressed and overworked medical staff (Missel, 2020) is a priority. Therefore, further actionable interventions that can establish a safe and secure lifestyle in the 'new normal' era are essential.