Dear Editor,

We read with interest the article by Hisamura et al, “Frequency of and factors associated with social problems among working-age patients with cancer in Japan.”1 The authors address an important and understudied aspect of cancer survivorship in working-age adults in Japan, and their large internet-based sample and attention to multiple domains of social problems add useful information to the field. We would like to offer several methodological comments that may assist readers in interpreting the findings and may inform future work.

First, the description of the multivariable logistic regression analysis appears inconsistent. The Methods section states that the likelihood ratio method for stepwise variable selection was used, yet it is also stated that all 18 independent variables were included in the models, regardless of statistical significance. These statements describe different strategies. If stepwise selection was applied, then some variables may have been excluded from the final models, with associated risks of biased coefficients and unstable P values, as highlighted in the regression modeling literature.2,3 Stepwise procedures in logistic regression are known to inflate type I error and to overfit models to idiosyncrasies of the sample. Conversely, if all predictors were forced into each model, then the mention of stepwise selection is misleading and prevents readers from fully understanding the modeling process. Clear reporting of whether models were full or reduced is essential for assessing internal validity and transportability.

Second, the handling and description of missing data require clarification. Early in the Methods section, the authors state that there were no missing values because respondents were required to answer all items. Later, however, several response options such as “do not want to answer,” “do not know,” and “not applicable” are recoded as missing for the purposes of regression, and missingness reaches 27.7% to 61.5% for some variables, followed by listwise deletion. From an analytic standpoint, this constitutes substantial missing data. Listwise deletion under such levels of missingness can markedly reduce effective sample size, lead to loss of precision, and introduce bias if data are not missing completely at random.4 Contemporary recommendations favor multiple imputation or other principled methods for handling missing data in observational studies, particularly when missingness is related to measured covariates or outcomes.4,5 A more detailed account of why listwise deletion was chosen and how many participants were retained in each model would help readers gauge the potential impact on results.

Third, the sample size justification is framed in terms of a “10 cases per independent variable” rule for multivariable regression, and the target of approximately 200 participants per age stratum is based on the number of predictors. Yet, in logistic regression, the relevant quantity is the number of events per variable (EPV), that is, the count of the less frequent outcome category relative to the number of candidate predictors, not total sample size. Simulation studies suggest that low EPV can yield biased estimates, wide confidence intervals, and poor model calibration.2,6 Given the relatively low prevalence of some severe social problems and the high proportion of analytically missing data, it is possible that some models had EPV well below conventional benchmarks. This may partially explain the very large odds ratios with wide confidence intervals reported for some predictors. Presenting the number of events and nonevents used in each model would allow readers to better assess model stability.

Fourth, the analysis strategy involves a large number of separate logistic regressions, each including the same set of 18 predictors for 20 different dichotomized outcomes. Although the authors appropriately use Bonferroni correction for some univariate comparisons, there is no comparable adjustment or discussion regarding multiplicity in the multivariable analyses. When many hypothesis tests are performed, the probability of chance findings increases, particularly in combination with selection procedures.3 While strict statistical adjustment is not always required, an explicit acknowledgment of the multiple testing burden and a focus on the consistency and plausibility of associations would strengthen interpretation.

Finally, we note some conceptual overlap between certain predictors and outcomes and the risk of causal overinterpretation. For example, current work status is included as a predictor for outcomes that concern returning to or remaining at work. This proximity may lead to circular interpretations and make it difficult to disentangle determinants from consequences. The cross-sectional design also precludes assessment of temporal ordering. Readers are, therefore, best advised to interpret the observed associations as indicators of co-occurring vulnerabilities among working-age patients with cancer, rather than as evidence that particular psychological or clinical factors causally determine social problems.

These comments do not detract from the importance of the authors' contribution in documenting the burden of social problems among working-age adults with cancer in Japan. Our aim is to emphasize that transparent reporting of model-building strategies, careful handling of missing data, and attention to events per variable and multiple testing are critical to maximize the robustness of inferences drawn from observational surveys. We hope that future work in this area will build on this valuable study while incorporating current methodological guidance on regression modeling and missing data.
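
As a rough illustration of the third and fourth comments, the following minimal Python sketch shows how events per variable and the size of the multiple testing burden might be computed. The function names are our own, and the event counts are hypothetical and are not taken from Hisamura et al; only the figures of 18 predictors and 20 outcomes come from the article as described above. The tally of 360 tests assumes one coefficient per predictor, and the family-wise calculation assumes independent tests, which will not hold exactly in practice.

```python
def events_per_variable(n_events: int, n_nonevents: int, n_parameters: int) -> float:
    """EPV: size of the less frequent outcome category per candidate parameter."""
    if n_parameters <= 0:
        raise ValueError("n_parameters must be positive")
    return min(n_events, n_nonevents) / n_parameters


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under a Bonferroni correction."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


def familywise_error_if_independent(alpha: float, n_tests: int) -> float:
    """Chance of at least one false positive, ASSUMING independent tests."""
    return 1.0 - (1.0 - alpha) ** n_tests


if __name__ == "__main__":
    # HYPOTHETICAL counts for one model after listwise deletion (not study data).
    n_events, n_nonevents = 20, 180
    n_predictors = 18  # number of independent variables described in the article
    n_outcomes = 20    # number of dichotomized outcomes described in the article

    epv = events_per_variable(n_events, n_nonevents, n_predictors)
    # Assumption: one coefficient (one test) per predictor per outcome.
    n_tests = n_predictors * n_outcomes

    print(f"EPV: {epv:.2f}")
    print(f"Number of coefficient tests: {n_tests}")
    print(f"Bonferroni threshold: {bonferroni_threshold(0.05, n_tests):.6f}")
    print(f"Family-wise error if independent: "
          f"{familywise_error_if_independent(0.05, n_tests):.4f}")
```

Under these hypothetical counts the EPV falls far below the conventional benchmark of 10, which is the kind of situation in which reporting events and nonevents per model would help readers judge stability.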
Ankur Sharma
Sushma Narsing Katkuri
Varshini Vadhithala
Journal of Psychosocial Oncology Research and Practice
Saveetha University
Jaypee Institute of Information Technology
Sharda University
www.synapsesocial.com/papers/69a287a00a974eb0d3c03698 — DOI: https://doi.org/10.1097/or9.0000000000000190