Patient Preference and Adherence, Volume 9

Development of the CoMac Adherence Descriptor™: a linguistically-based survey for segmenting patients on their worldviews

Authors Connor U, Mac Neill R, Mzumara H, Sandy R

Received 26 November 2014

Accepted for publication 26 January 2015

Published 26 March 2015 Volume 2015:9 Pages 509–515


Checked for plagiarism Yes

Review by Single anonymous peer review

Peer reviewer comments 3

Editor who approved publication: Dr Johnny Chen


Ulla M Connor,1 Robert S Mac Neill Jr,1 Howard R Mzumara,2 Robert Sandy1

1International Center for Intercultural Communication (ICIC), Indiana University – Purdue University, Indianapolis, IN, USA; 2Testing Center – Division of Planning and Institutional Improvement, Indiana University – Purdue University, Indianapolis, IN, USA

Abstract: Nonadherence to prescribed medication and healthy behaviors is a pressing health care issue. Much research has been conducted in this area under a variety of labels, such as compliance, disease management and, most recently, adherence. However, the complex factors related to predicting and, more importantly, understanding and explaining adherence have remained elusive. Through an in-depth linguistic analysis of patient talk, the International Center for Intercultural Communication (ICIC) at Indiana University has produced a psycholinguistic coding system that uses patients’ own language to cluster them into distinct groups based on their worldviews. ICIC’s studies have shown, for example, that patients reveal their fundamental perceptions about themselves and their environment in their life narratives; that clustering of individual patients based on these different perceptions is possible via the use of differential language in survey questions; and that differential language can be used to tailor messages for individual patients in a manner that these individuals prefer over generically worded communication. In grant-funded research, an interdisciplinary team of researchers at the ICIC reviewed the literature and identified three basic psychosocial tenets related to adherence: control orientation, based on locus of control research; agency, based on self-efficacy; and affect or attitude and emotion. These three constructs were selected because, in the published literature, they have been consistently found to be connected to patient adherence. Based on this research, a survey, the CoMac Descriptor™, was developed. This report shows that The Descriptor™ questions and responses are valid and reliable in segmenting patients across psychosocial constructs, which will have positive implications for health care providers and patients.

Keywords: adherence, communication, diabetes


Nonadherence to prescribed medication and healthy behaviors is a pressing health care issue. Much research has been conducted in this area under a variety of labels, including compliance, disease management and, most recently, adherence. Researchers have examined a wide range of variables such as psychological characteristics, health beliefs, and demographic information. Vermeire et al1 and van Dulmen et al2 provided meta-analyses of this research.

The complex factors related to predicting and, more importantly, understanding and explaining adherence nevertheless remain elusive. A frequently expressed concern is that research has typically examined adherence from the perspective of health care professionals rather than from the perspective of people living with a particular health issue. In response to this concern, van Dulmen et al2 have called for patient perspectives to be included in future research.

Through the in-depth linguistic analysis of patient talk, the International Center for Intercultural Communication (ICIC) at Indiana University (IU) (Indianapolis, IN, USA) has produced a psycholinguistic coding system that uses patients’ own language to cluster patients into distinct groups based on their worldviews. This patient-focused research has contributed three significant insights with regard to patient perspectives that are relevant to health communication.3–5 These are:

  1. Patients reveal fundamental perceptions of themselves and their environment in their life narratives. Specifically, not only do people see themselves and their world differently, they use different words and language structures to describe events in their lives and in the management of a disease.
  2. Clustering of individual patients based on these different perceptions is possible via the use of differential language in survey questions. Accordingly, individuals self-select survey responses that contain the language that reflects their worldviews – ie, their perceptions of self, their environment, and health beliefs.
  3. Differential language can be used to tailor messages for individual patients in a manner that these individuals prefer over generically worded communication.

These insights were derived from in-depth interviews with 65 patients, ranging in duration from 45 minutes to 3 hours. The researchers’ goal was to put this research into practice; specifically, to find a more effective way to segment patients than conducting long and costly in-depth patient interviews and analyses. In this article, we outline the path and methods employed to create a brief and effective survey that meets the challenge of cost-effectively clustering patients based on their perceptions and worldviews. The remainder of this article describes the stages in the development process.


The following stages provide an overview of the research methodology. In stage 1, we first identified three key psychosocial dimensions related to adherence and then identified the unique linguistic features that accompany each dimension, as reported in Connor et al,3 Connor and Lauten,4 and Connor et al.5 In stage 2, based on the linguistic feature systems from stage 1, we developed a survey instrument that segmented patients based on their differential worldviews and health beliefs. In stage 3, we tested the validity and reliability of the survey with type 2 diabetes patients. In stage 4, we adapted the survey to other conditions – specifically, hypertension patients and attention deficit hyperactivity disorder (ADHD) caregivers. In stage 5, we carried out a psychometric analysis of all administered surveys.

Stage 1: identification of key constructs and linguistic features

In grant-funded research, an interdisciplinary team of researchers at the ICIC reviewed the literature and identified three basic psychosocial tenets related to adherence: control orientation,6 based on locus of control research; agency,7 based on self-efficacy; and affect8 or attitude and emotion. These three constructs were chosen because, in the published literature, they have been consistently found to be related to patient adherence.1 In our research, control orientation refers to the perceived amount of control a person has over disease-related events occurring in their life, while agency means an individual’s capacity to follow through on instructions.5 Affect conveys how the patient perceives the outlook for, and the consequences of, their disease self-management. Just as agency can be categorized as high or low, affect can be grouped into positive and negative.9

Using an interview protocol consisting of open-ended questions in four different categories – going back to diagnosis, attitude to medication, attitude to doctor/health care provider, and managing diabetes – 43 interviews were conducted with native English-speaking patients and 22 interviews were conducted with Spanish-speaking patients in a Midwestern state in the United States. Following institutional review and approval for the protection of human subjects, participants were recruited from diabetes health clinics in a US Midwestern city. Linguistic analyses of the data produced linguistic feature systems to describe the patient talk for each of the constructs: control orientation; agency; and affect. The constructs were found to be cross-cultural, with language-specific manifestations, as discussed in the published work.3–5,9

Stage 2: development of the survey instrument

Based on this linguistic research,3–5,9 a survey (the CoMac Descriptor™) was developed. The CoMac Descriptor™ includes questions that invite patients to self-identify with the actual words and word structures used by other patients with the same chronic illness. The same categories of questions used in the ICIC research were employed: recalling the initial diagnosis; attitude to medication; attitude to doctor/health care provider; and managing diabetes. Based on the responses selected by the patient, the survey categorizes patients into one of eight clusters (Table 1). The survey was designed to be administered in 10–15 minutes, on paper or online.

Table 1 CoMac Descriptor™ clusters
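The eight clusters follow from crossing the two poles of each of the three constructs. As a minimal sketch, assuming the three-letter codes listed in the abbreviations for Table 5 (control orientation, then affect, then agency), the combinations can be enumerated as follows. The function and variable names are illustrative and are not part of the instrument.

```python
from itertools import product

# Pole labels per construct, in the letter order used by the cluster
# abbreviations (control orientation, then affect, then agency).
CONTROL = {"internal": "I", "external": "E"}
AFFECT = {"positive": "P", "negative": "N"}
AGENCY = {"high": "H", "low": "L"}


def cluster_code(control: str, affect: str, agency: str) -> str:
    """Return the three-letter cluster code, e.g. 'IPH' for internal/positive/high."""
    return CONTROL[control] + AFFECT[affect] + AGENCY[agency]


# Every combination of one pole per construct: 2 x 2 x 2 = 8 clusters.
ALL_CLUSTERS = [cluster_code(c, a, g) for c, a, g in product(CONTROL, AFFECT, AGENCY)]

if __name__ == "__main__":
    print(ALL_CLUSTERS)
```

How a respondent's individual answers are combined into one pole per construct is not specified here, so the sketch starts from poles that have already been assigned.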

Stage 3: testing validity and reliability of the survey among diabetes type 2 patients

The initial CoMac Descriptor™ was a 27-item survey. In order to test its validity and reliability, the instrument was given multiple times to patients at a suburban US Midwest diabetes treatment center. This clinic had access to a type 2 diabetes patient population with repeated clinic visits over a long period of time because the clinic provided free medications. The participants were recruited following institutional reviews and approvals. The results of the diabetes studies are reported in Clark et al.10 The validity of the CoMac Descriptor™ was first tested with 20 patients with type 2 diabetes. The clustering results obtained with the CoMac Descriptor™ were compared to the clustering results of three trained linguists who conducted in-depth interviews with the same patients. The three linguistic coders had 100% agreement among themselves on the individual clustering results. The results from the CoMac Descriptor™ clustering were then compared to the interviewers’ clustering to determine the concurrence of the domain assignment by the two methods. Overall, there was 75% agreement between the CoMac Descriptor™ results and the in-depth linguistic interviewers’ results; agreement between the two methods on control orientation, agency, and affect was 75%, 70%, and 80%, respectively. The reliability of the survey data was calculated with 37 patients using Cohen’s kappa, and the value was 0.717.
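The two statistics reported above, percent agreement and Cohen’s kappa, can be illustrated with a short sketch. It uses the standard definition of kappa (observed agreement corrected for the agreement expected by chance). The text does not say which pairs of ratings entered the kappa calculation, so the labels below are invented purely for illustration and are not study data.

```python
from collections import Counter


def percent_agreement(a, b):
    """Share of cases on which two sets of ratings assign the same label."""
    if len(a) != len(b) or not a:
        raise ValueError("need two non-empty label lists of equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def cohens_kappa(a, b):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(a)
    p_o = percent_agreement(a, b)
    count_a, count_b = Counter(a), Counter(b)
    # Chance agreement: product of the two marginal proportions, summed over labels.
    p_e = sum(count_a[label] * count_b[label] for label in count_a) / (n * n)
    if p_e == 1.0:
        return 1.0  # only one label in use; kappa is undefined, treated here as perfect
    return (p_o - p_e) / (1 - p_e)


# Hypothetical agency assignments for 20 patients (illustrative only).
SURVEY = ["high"] * 12 + ["low"] * 8
INTERVIEW = ["high"] * 10 + ["low"] * 2 + ["low"] * 5 + ["high"] * 3

if __name__ == "__main__":
    print(f"agreement = {percent_agreement(SURVEY, INTERVIEW):.2f}")
    print(f"kappa     = {cohens_kappa(SURVEY, INTERVIEW):.3f}")
```

Kappa is lower than raw agreement whenever some agreement would be expected by chance alone, which is why the two figures are reported separately.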

Another step in the validation process was to compare the results generated by the CoMac Descriptor™ with the impressions of a highly experienced clinician who was actively engaged in the care of the patients being studied. The clinician was the head nurse (who is also a trained diabetes nurse educator) responsible for the care of the surveyed patients. She knew the patients’ health care management behaviors well. She was familiarized with the psycholinguistic constructs and was asked for her evaluation of the patients according to the eight psychosociolinguistic segments. Of the 16 patients with diabetes, in 13 cases there was agreement between the clinician’s classification of the patients and that derived from the CoMac Descriptor™. This step reinforced the conclusion that the survey would yield clinically meaningful and actionable information. The clinician viewed our segmentation approach favorably and suggested that it could be considered a “structured” version of the intuitive insights of experienced health care professionals, and that this structure was of significant value, especially for less experienced practitioners.

Stage 4: adaptation of the survey to other chronic conditions

The next step focused on the broader application of the survey to other chronic conditions. Specifically, we modified the CoMac Descriptor™ for hypertension and ADHD caregiving. We also wanted to determine whether the hypertension results would be consistent across cultures and languages, and whether hypertension patient segmentation would yield comparable results to those of the survey used for diabetic patients in English. The data for the hypertension study came from a clinical study of adherence conducted by Quintiles, Inc. (Durham, NC, USA) in the United Kingdom, Germany, Italy, and Spain with institutional reviews and approvals.11,12 The samples were drawn from the Quintiles Mediguard database of hypertension patients who signed up for drug and health care information, and from the participating clinics. The participants in the ADHD context were recruited from a large database of patients and health care providers in collaboration with Verilogue Inc. (Horsham, PA, USA), with institutional reviews and approvals.13

With both modifications (hypertension, ADHD), the following steps were taken:

  1. Two in-depth interviews with physicians treating the condition;
  2. Ten in-depth interviews with patients (hypertension) and caregivers (ADHD);
  3. Linguistic analysis of the interview data in line with what had been done with diabetes patients; and
  4. Modifications of the questions for the specific disease state.

Stage 5: psychometric analysis of all administered surveys

To evaluate the usefulness of The Descriptor™, we used psychometric analyses to answer the following questions:

  1. How effective are The Descriptor™ questions at identifying differential subject perceptions?
  2. How consistent is The Descriptor™ in its ability to accurately segment patients across different disease states and different environments? (Total number [N] =636 patients – hypertension, diabetes, ADHD; administered in the United States, UK, Germany, Italy, and Spain).

Question 1

How effective are The Descriptor™ questions in the identification of differential subject perceptions? Two analytical techniques were chosen to refine the individual questions, either by eliminating weaker questions or by rewording questions to make them more effective in differentiating between the psychosocial constructs:

  1. Concurrence, which measured the percentage of agreement between a single question and the results of all other questions on the same construct; and
  2. Latent class analysis (LCA), which looked for sets of questions that were correlated to the same unobserved (latent) construct.

Concurrence is a deterministic analysis in that constructs measured by each question are assigned a priori based on psychosociolinguistic theory. Concurrence is expressed in percentages. A value of 100% for a question means that it always agrees with the results based on the majority of the remaining questions in the same construct. A score of 0% would mean that the results for a question never agree with the results of the other questions. Each question was developed around a particular dimension and its two poles – internal versus external control orientation, high versus low agency, and positive versus negative affect.
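A minimal sketch of the concurrence calculation described above follows. It assumes that each respondent’s answers within one construct are stored as a mapping from question identifier to the pole of the selected option. The question identifiers, the example data, and the handling of ties (respondents whose remaining answers are evenly split are skipped) are assumptions of this sketch, not details taken from the study.

```python
from collections import Counter


def concurrence(responses, question):
    """Percentage of respondents whose answer to `question` matches the majority
    pole of their answers to the remaining questions on the same construct.

    `responses` is a list of dicts mapping question id -> pole label.
    Respondents whose remaining answers are tied are skipped (sketch assumption).
    """
    agree = counted = 0
    for answers in responses:
        others = [pole for q, pole in answers.items() if q != question]
        ranked = Counter(others).most_common(2)
        if not ranked or (len(ranked) == 2 and ranked[0][1] == ranked[1][1]):
            continue  # no remaining questions, or no clear majority
        counted += 1
        agree += answers[question] == ranked[0][0]
    if counted == 0:
        raise ValueError("no respondent had a clear majority on the other questions")
    return 100.0 * agree / counted


# Hypothetical answers of four respondents to four agency questions.
AGENCY_ANSWERS = [
    {"q1": "high", "q2": "high", "q3": "high", "q4": "low"},
    {"q1": "low", "q2": "low", "q3": "low", "q4": "low"},
    {"q1": "high", "q2": "low", "q3": "low", "q4": "high"},
    {"q1": "low", "q2": "high", "q3": "low", "q4": "low"},
]

if __name__ == "__main__":
    for q in ("q1", "q4"):
        print(q, concurrence(AGENCY_ANSWERS, q))
```

In this invented data set, q4 agrees with the other questions less often than q1 does, which is the kind of signal that would flag a question for rewording or removal.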

LCA is a probabilistic analysis that relates a set of observed variables to a set of latent variables. It is a measure that tests whether the questions aimed at a specific domain (eg, agency) belong to the same latent (unobservable) class (eg, high agency versus low agency). The probability of the correct prediction of the patient’s domain was as high as 100%, and directionally correct in each of the 16 questions.14–16 In our instrument, the observed variable would be the answer choice of the subject. Recall that each answer choice reflects the differential language usages and patterns of individuals with different world- and self-views, and that the latent variables would be the different psychosocial dimensions. In Tables 2–4, the numbers represent the probability that a person with a certain psychosocial dimension selected the answer that was created with his or her language versus the probability that he or she selected a choice with the language of the opposite pole of the dimension. As an example, question 1 in Table 2 is defining the probability that a “high” agency subject picks the “high” agency option, as well as the probability that a “high” agency subject picks a “low” agency option. As can be seen, this is a powerful question, as the probability of a “high” agency person picking the “high” agency answer is 100% and the probability of a “high” agency person picking the “low” agency answer is 0%.
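To make the idea concrete, the sketch below fits a generic two-class latent class model to binary answers with the expectation-maximization (EM) algorithm. This is a textbook formulation and is not the software or model specification used in the study; the function name, starting values, and data are all assumptions made for illustration. The estimated item probabilities play the same role as the values described for Tables 2–4: the probability that a member of a given class selects the “high” option on each question.

```python
import random


def fit_two_class_lca(data, n_iter=200, seed=0):
    """Fit a two-class latent class model to binary items with EM.

    data: list of rows of 0/1 answers (1 = the 'high'-pole option was chosen).
    Returns (weights, probs), where weights[c] is the estimated share of class c
    and probs[c][j] is the probability that a member of class c answers 1 on item j.
    """
    rng = random.Random(seed)
    n_items = len(data[0])
    weights = [0.5, 0.5]
    # Random start, nudged so that class 0 begins as the 'high' class.
    probs = [
        [rng.uniform(0.6, 0.9) for _ in range(n_items)],
        [rng.uniform(0.1, 0.4) for _ in range(n_items)],
    ]
    for _ in range(n_iter):
        # E-step: posterior probability of each class for each respondent.
        post = []
        for row in data:
            like = []
            for c in range(2):
                value = weights[c]
                for j, x in enumerate(row):
                    value *= probs[c][j] if x else 1.0 - probs[c][j]
                like.append(value)
            total = like[0] + like[1]
            post.append((like[0] / total, like[1] / total))
        # M-step: re-estimate class shares and item-response probabilities.
        for c in range(2):
            mass = max(sum(p[c] for p in post), 1e-12)
            weights[c] = mass / len(data)
            for j in range(n_items):
                hits = sum(p[c] * row[j] for p, row in zip(post, data))
                # Clamp away from 0 and 1 so the likelihood never collapses to zero.
                probs[c][j] = min(max(hits / mass, 1e-6), 1.0 - 1e-6)
    return weights, probs


# Synthetic answers to four hypothetical agency questions (not study data).
SYNTHETIC = (
    [[1, 1, 1, 1]] * 25 + [[1, 1, 1, 0]] * 5 + [[0, 0, 0, 0]] * 15 + [[0, 0, 0, 1]] * 5
)

if __name__ == "__main__":
    weights, probs = fit_two_class_lca(SYNTHETIC)
    for j in range(4):
        print(f"question {j + 1}: P(high option | high class) = {probs[0][j]:.2f}, "
              f"P(high option | low class) = {probs[1][j]:.2f}")
```

In this synthetic data set the first three questions separate the two classes almost perfectly, while the fourth is noisier, so its estimated probabilities sit further from 100% and 0%.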

Table 2 Question selection values – agency (high and low)

Table 3 Question selection values – affect (positive and negative)
Abbreviation: NA, not applicable.

Table 4 Question selection values – locus of control (internal and external)

The two psychometric types of analyses described earlier were applied to 636 subjects: 358 hypertensive patients; 140 diabetic type 2 patients; and 138 ADHD caregivers. Tables 2–4 identify the values for four of the retained questions in each dimension. An additional example is given for a nonretained question. The criteria for the selection and ranking of questions in the most recent CoMac Descriptor™ were as follows:

  1. First priority was given to LCA in the largest populations. Specifically, the LCA value used for a particular question would come from the largest population of subjects. As an example, the LCA analysis for question 1 in Table 4 came from the hypertension study (358 subjects) rather than the diabetes study (140 subjects).
  2. Priority was given to questions with both strong LCA and concurrence. Specifically, question 3 ranks above question 4 in Table 3 because it has both types of analyses, even though the LCA analysis is not as powerful.

One question that was modified from its original form is the following “agency” question, which was number 19 in the first diabetes Descriptor™ and number 18 in the hypertension Descriptor™: “Which of the following describes how well you control XXXXX in general? Choose one option.”

  1. I am in control most of the time.
  2. I am usually in control but not always.

When a concurrence analysis was performed, the question was deemed to be acceptable with a score of 52/37. However, LCA revealed a problem with the question, scoring 0/100 and 33/66 for the diabetes population, respectively, indicating that none of the people determined to have high agency chose the high-agency answer. The numbers for the hypertension population, 66/33 and 67/34, respectively, were better but not ideal. Reviewing the answers linguistically, it was determined that the answers were too similar. A choice was made to differentiate the answers even more in order to eliminate the false “low” answers and bolster the correlation between the chosen answer and the patients’ correct construct poles. The new answers are:

  1. I am in control.
  2. I am not very good at managing my diabetes.

In the most recent administration of The Descriptor™ to 21 diabetes patients, the revised question correctly correlated to the patients’ overall agency dimension (high or low) in 20 out of the 21 cases.

Question 2

How consistent is The Descriptor™ in its ability to accurately segment patients across different disease states and different environments? (N=636 total patients – hypertension, diabetes, ADHD; administered in the US, UK, Germany, Italy, and Spain).

The hypertension clusters were from a study by Sandy et al.11 Diabetes data were from Clark et al.10 ADHD caregiver data were reported by Barnett and Connor.13

As indicated by Table 5, the distribution of patients across clusters is generally consistent. For example, the cluster combining “high agency”, “positive affect”, and “internally motivated” was the largest for diabetes and ADHD, and almost the largest for hypertension. Conversely, the cluster combining “low agency”, “negative affect”, and “externally motivated” was the smallest for all three diseases. This indicates the robustness of the psychosocial constructs across disease states and cultures. Of special note is that this consistency also occurs whether dealing with patients or with caregivers.

Table 5 Clustering results across disease states and countries
Abbreviations: HYP, hypertension; ADHD, attention deficit hyperactivity disorder; IPH, internal positive high; IPL, internal positive low; INH, internal negative high; INL, internal negative low; EPH, external positive high; EPL, external positive low; ENH, external negative high; ENL, external negative low.


We have demonstrated that The Descriptor™ questions are valid and reliable in segmenting patients across psychosocial constructs. While the current versions of The Descriptor™ are being used in studies that are confirming its value, we continue to test and refine The Descriptor™ with the goal of producing a 12-question version with even better LCA and concurrence values, suitable for intercultural applications and for many disease conditions.

The Descriptor™ provides a unique patient profiling tool for use by health care providers in communicating with patients. The segments identified by The Descriptor™ permit linguistic tailoring of education messages for those segmented members.17 Table 6 shows appropriate communication strategies for the constructs. In particular, identifying individuals with common linguistically-based psychosocial characteristics and tailoring content accordingly will encourage patient attention to important health messages. In busy practice settings, identifying the psychosocial characteristics as a basis for tailoring health messages is particularly helpful.

Table 6 Linguistic and theoretical construct features for message development

In addition to the development of the segmentation process and the identification of accompanying communication frameworks and strategies, a feasibility study tested patient and health care provider satisfaction with messages based on these constructs in face-to-face clinic consultations.17 Both patients and health care providers preferred the tailored messages over the standard messages recommended by the American Association of Diabetes Educators. The tailored messages were preferred because they were perceived as more personalized: the recipients felt that such messages addressed them more intimately than did the standard messages. This sense of intimacy would warrant the message recipient’s attention to the content and ultimately contribute to positive outcomes for their health.

Clustering patients based on their worldviews and perceptions with the survey is cost effective. In the initial work, when linguists conducted 1-hour (or longer) one-to-one interviews with 65 patients to talk about their disease state and experiences, the cost of such interviews exceeded US$350. The costs included scheduling the patient, travel and fees for the linguist, fees for linguistic validation and, sometimes, patient honoraria. In contrast, the fee for a validated survey, The Descriptor™, was $10, and the survey had no personnel or geographic limits. The survey tool leads to more effective communication, which ultimately results in greater patient engagement and healthier behaviors.


In conclusion, our approach focuses on the patient perspective by defining how the language that patients use conveys the health beliefs and worldviews of those patients. We provide an innovative approach that goes beyond existing communicative strategies by 1) efficiently segmenting patients according to those views via a brief survey; 2) providing health care providers with this knowledge; and 3) providing health care providers with a communication strategy and structure to convey key health care behavior messages that are consistent with the patients’ own language and usage. This leads to more effective communication which, in turn, changes health behaviors and, ultimately, health outcomes. This patient-centered approach provides health care providers with effective, individualized communication strategies.

We anticipate future implementation studies will provide critical information about the usability of the intervention in clinical settings, will demonstrate patient and provider satisfaction with the intervention, and will measure the impact on selected health behaviors.


Ulla M Connor is the chief scientific officer at CoMac Analytics, Inc., the developer of The Descriptor™ instrument. Robert S Mac Neill Jr is the chief executive officer at CoMac Analytics, Inc. Robert Sandy is a principal at CoMac Analytics, Inc. The authors report no other conflicts of interest in this work.



1. Vermeire E, Hearnshaw H, Van Royen P, Denekens J. Patient adherence to treatment: three decades of research. A comprehensive review. J Clin Pharm Ther. 2001;26(5):331–342.

2. van Dulmen S, Sluijs E, van Dijk L, de Ridder D, Heerdink R, Bensing J; International Expert Forum on Patient Adherence. Furthering patient adherence: a position paper of the international expert forum on patient adherence based on an internet forum discussion. BMC Health Serv Res. 2008;8:47.

3. Connor U, Goering EM, Matthias MS, Mac Neill R. Information use and treatment adherence among patients with diabetes. In: Ruiz-Garrido MF, Palmer JC, Fortanet-Gómez I, editors. English for Professional and Academic Purposes. Amsterdam, the Netherlands: Rodopi; 2009:89–104.

4. Connor U, Lauten K. A linguistic analysis of diabetes patients’ talk. In: Hamilton HE, Chou WS, editors. The Routledge Handbook of Language and Health Communication. New York, NY: Routledge; 2014:91–108.

5. Connor U, Anton M, Goering E, et al. Listening to patients’ voices: linguistic indicators related to diabetes self-management. Commun Med. 2012;9(1):1–12.

6. Wallston KA, Wallston BS, DeVellis R. Development of the multidimensional health locus of control (MHLC) scales. Health Educ Monogr. 1978;6(2):160–170.

7. Bandura A. Self-efficacy: toward a unifying theory of behavioral change. Psychol Rev. 1977;84(2):191–215.

8. Martin JR, White PR. The Language of Evaluation. New York, NY: Palgrave Macmillan.

9. Anton M, Goering E. Understanding Patients’ Voices. Amsterdam, the Netherlands: John Benjamins; in press 2015.

10. Clark CM Jr, Connor U, Lauten K, Mac Neill R Jr, Sandy R. A linguistic approach to improving self-care and compliance. Journal for Patient Compliance. 2012;2(4):20–22.

11. Sandy R, Cascade E, Connor U, Cousins F. Variation in medication adherence across patient behavioral segments: a multi-country study in hypertension. In press 2014.

12. Cascade E, Sandy R, Connor U, Cousins F. Variance in medication adherence by patient behavioral segment: a multi-country study in hypertension. Poster presented at: ISPOR 17th Annual International Meeting; June 2–6, 2012; Washington, DC.

13. Barnett J, Connor U. Personalized behavioral messaging: driving adherence through language. Paper presented at: The Thirteenth Population Health and Care Coordination Colloquium; March 14, 2013; Philadelphia, PA.

14. McCutcheon AL. Basic concepts and procedures in single- and multiple-group latent class analysis. In: Hagenaars JA, McCutcheon AL, editors. Applied Latent Class Analysis. Cambridge, UK: Cambridge University Press; 2002:56–86.

15. Collins LM, Lanza ST. Latent Class and Latent Transition Analysis for the Social, Behavioral, and Health Sciences. New York, NY: Wiley; 2010.

16. Lazarsfeld PF, Henry NW. Latent Structure Analysis. Boston, MA: Houghton Mifflin; 1968.

17. Ellis RJ, Connor U, Marshall J. Development of patient-centric linguistically tailored psychoeducational messages to support nutrition and medication self-management in type 2 diabetes: a feasibility study. Patient Prefer Adherence. 2014;8:1399–1408.

Creative Commons License This work is published and licensed by Dove Medical Press Limited. The full terms of this license are available at and incorporate the Creative Commons Attribution - Non Commercial (unported, v3.0) License. By accessing the work you hereby accept the Terms. Non-commercial uses of the work are permitted without any further permission from Dove Medical Press Limited, provided the work is properly attributed. For permission for commercial use of this work, please see paragraphs 4.2 and 5 of our Terms.
