Tips for Running Usability Tests

Nice to meet you. I'm Aoshima, a UI designer at KINTO Technologies. I usually handle the UI for business applications. A little while ago, we ran a usability test to see how customers use our website, aiming to use the insights for our site redesign. It was also something of a trial run for us, so we kept things small by recruiting participants from within the company. Even so, we were able to collect data with plenty of valuable findings. In this article, I'll share an outline of the test and the tips we used to carry it out.
What Is a Usability Test?
First of all, despite the name, a usability test isn't about passing or failing. It's a method for evaluating three key factors that are essential to the concept of usability. So let's start by looking at what usability means in general.
The Definition of Usability
The term "usability" is often used in a broad or vague way, typically to mean how easy something is to use. However, it actually has a clear definition set by the international standard ISO 9241: "The extent to which specified users can use a product to achieve specified goals with effectiveness, efficiency, and satisfaction in a specified context of use." The three key elements of effectiveness, efficiency, and user satisfaction can be explained as follows:
- Effectiveness: whether users can achieve their goals at all. For example, on an e-commerce site, can users successfully complete a purchase?
- Efficiency: whether users who can achieve their goals are able to do so via the shortest possible path, without unnecessary steps.
- Satisfaction: the degree to which users can operate the product comfortably and without frustration, even when there are no major problems with effectiveness or efficiency.

To take an e-commerce site as an example: if users can't complete a purchase in the first place, usability is very low or absent altogether. Even if they can complete their goals, usability is still low when it takes too many steps, such as struggling to find what they're looking for. And if the experience is frustrating or unpleasant for any reason, satisfaction drops and so does overall usability. Leaving such issues unaddressed could mean losing valued customers to competing products or companies. To prevent this, it's essential to understand how customers actually behave when using your product or website, and to take usability into account.
Usability Test: What It Can Do and What It's For
The foundation of a company's profit-making activities is making sure that customers are happy to use its products and services rather than put off by them, so any problems with those products or services should be improved. The first step is to identify the problem areas. Fortunately, we already had access to a customer survey conducted by our analytics team. It included targeted questions about which parts of the website users found confusing, and we used that feedback as a key reference point when planning our usability test.
If you can observe how customers behave at those problematic points, it gives you clues as to why those areas are causing trouble in the first place. Gathering these kinds of hints is what usability testing is for. To put it another way, a survey is like the English tests you'd take in school: it's good at pinpointing what went wrong, such as whether it was listening, grammar, or something else. Usability testing, on the other hand, is better at uncovering why something went wrong and what could be done to improve it.
Preparation Before the Test
Setting Tasks and Conditions
As a first step, we prepared tasks and conditions based on the findings from the earlier survey. The results showed that users across all age groups had trouble understanding certain areas of the website, so we focused on those points and designed tasks to evaluate two of the key aspects: effectiveness and efficiency. Setting these tasks and conditions mattered for two reasons. First, if we let participants explore the site freely, they might complete the session without ever encountering the problematic areas; setting tasks prevented that. Second, it allowed participants with varying levels of digital literacy to perform the tasks under the same conditions.
Preparing a Script, Questions, and Surveys
Next, we prepared three key materials: a script for explaining the test to participants, a question sheet for understanding their background and digital literacy, and two post-test surveys to be filled out after the session.

To help participants feel comfortable taking part, it was important to clearly explain the content and flow of the test beforehand. We also asked questions to better understand each person's background and level of digital literacy, and having a script ensured that everything was explained clearly and the test ran smoothly without missing any steps. Ice-breakers and other small things can take up more time than expected, so it's a good idea to set a rough timetable for the session if possible.

The post-test surveys were designed to measure the third and final key element mentioned earlier: satisfaction. For this, we used two metrics: the Customer Satisfaction Score (CSAT) and the System Usability Scale (SUS). CSAT is commonly used in customer satisfaction surveys and measures how satisfied users are on a five-point scale. SUS, on the other hand, measures how users perceive aspects like ease and difficulty of use, and is widely used as a standard metric for evaluating overall UX. One reason SUS is especially useful is that it comes with a clear benchmark: a score below 68 is a sign that usability needs to be reviewed, which makes the metric easy to interpret and practical to apply.
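To make the scoring concrete, here is a minimal sketch of how CSAT and SUS could be calculated from raw questionnaire answers. The function names and sample data are purely illustrative and not part of our actual tooling; the SUS calculation follows the standard published scoring rule (odd-numbered items contribute response − 1, even-numbered items contribute 5 − response, and the sum is multiplied by 2.5 to give a 0-100 score).

```python
# Illustrative scoring sketch; the data layout is a hypothetical example.

def csat_score(ratings: list[int]) -> float:
    """Average satisfaction across participants on a 1-5 scale."""
    return sum(ratings) / len(ratings)

def sus_score(responses: list[int]) -> float:
    """Score one participant's 10 SUS answers (each 1-5) on a 0-100 scale."""
    if len(responses) != 10:
        raise ValueError("SUS expects exactly 10 item responses")
    total = 0
    for item_number, answer in enumerate(responses, start=1):
        if item_number % 2 == 1:      # positively worded items
            total += answer - 1
        else:                         # negatively worded items
            total += 5 - answer
    return total * 2.5

# Example: three participants' overall satisfaction ratings, and one SUS sheet
print(csat_score([4, 5, 3]))                      # 4.0 on the 5-point scale
print(sus_score([4, 2, 4, 1, 5, 2, 4, 2, 4, 1]))  # 82.5, above the 68 benchmark
```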
Device Setup
As a final step, we set up a smartphone and a laptop to record the participants' on-screen actions and facial expressions during the test: the smartphone filmed hand movements and the laptop captured facial reactions, both arranged ahead of time. Once the test began, we logged into Microsoft Teams on both devices and used the built-in recording feature. This was extremely handy because it automatically saves the recordings to the cloud and combines them into a two-screen layout, making review much easier. By the way, the smartphone stand came from a 100-yen shop. As a side note, some years ago we used two separate video cameras, one for hand movements and the other for facial expressions. The footage had to be saved locally and edited manually to create a synchronized split-screen view for comparison. Thinking back on that process, I was genuinely impressed by how much easier testing has become in just the past few years.
Carrying Out the Test
Once all the preparation was complete, it was finally time to run the test. After a brief ice-breaker and some explanations and questions, we moved on to the tasks on the website. The test was conducted using the think-aloud method, in which participants are asked to verbalize their thoughts as they perform each operation. Combining the visual record of what participants do on screen with their spoken thoughts lets us understand their behavior from multiple angles.
Things to Watch Out for During Testing
There are two things to keep in mind during the test. First, because participants are not used to verbalizing their thoughts while performing tasks, the interviewer needs to consistently prompt them to share what they're thinking so that long silences don't set in. Second, participants often ask the interviewer questions during the test, but it's best to gently deflect them as much as possible (without ignoring them). During the pre-test explanation, we made it clear that the purpose was not to evaluate how well the participant could use the website, but to assess how easy or difficult the website was to understand. Even so, when participants felt unsure, they often asked questions instinctively. Answering those questions could introduce bias, so it was important to judge carefully whether a question was appropriate to respond to. Once the tasks were completed and the surveys filled out, the test came to an end.
Preparation for Analysis
After the test, the next step was preparing for the analysis phase. It would be nice to take a breather after wrapping up the test, but this was actually where the more time-consuming work began. The first thing to do was transcribe the recordings.
For Information Sharing
The audio recordings from the test ran about 20 to 30 minutes per participant, but transcribing them took quite a bit of time, since we often had to rewind and replay unclear parts. This might have been the toughest part. That said, converting time-based audio into plain text made information sharing much easier, so for the sake of future analysis and collaboration it was a step worth sticking with, even if it required quiet persistence. (The automatic transcription tools still felt far from reliable at the time.)

The next step was to categorize and tag the spoken content to make it easier to organize. We first compiled everything chronologically in a spreadsheet, then copied it into a tool like Miro. This allowed us to get an overview of multiple users' behavior and organize insights from various angles; a rough sketch of that kind of structure follows below.

If you want to take information sharing a step further, you can also create short, edited clips of the test footage with subtitles, making it easy to share what happened during the session. If time allows, it might be worth the extra effort. In our case, we had only five participants, which made it manageable enough to go that far, but it was still a very time-intensive and demanding process.
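As a rough illustration of the tagging step described above, here is a minimal sketch of how transcribed comments might be grouped by tag before being moved into a whiteboard tool. The CSV layout, file name, and tag values are hypothetical examples, not our actual data or tooling.

```python
# Hypothetical example: group transcribed utterances by tag for an overview.
import csv
from collections import defaultdict

# utterances.csv is assumed to have the columns: participant, timestamp, tag, text
groups: dict[str, list[str]] = defaultdict(list)

with open("utterances.csv", newline="", encoding="utf-8") as f:
    for row in csv.DictReader(f):
        groups[row["tag"]].append(
            f'{row["participant"]} @ {row["timestamp"]}: {row["text"]}'
        )

# Print one block per tag, so patterns across participants are easier to spot
for tag, comments in sorted(groups.items()):
    print(f"{tag} ({len(comments)} comments)")
    for comment in comments:
        print("  -", comment)
```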
From Analysis to Improvement
Normally, we would have a group of people analyze the data and use it to drive improvements. However, since this was more of a test run to validate the testing process itself, I simply wrapped up my own observations into a report and left it there for the time being. Ideally, you would gather enough stakeholders to hold a proper discussion, exchange opinions, and examine the findings together. By going through this kind of process, I believe it becomes possible to move forward with improvements based on a shared understanding. Everything written here has been preparation for reaching that point.
Lastly
In this article, I wrote about a test we conducted on a small section of our website, which is just one part of the entire service. Even for testing such a limited scope, a great deal of time and preparation was required. But I believe that these small, steady efforts accumulate and ultimately lead to a better experience for our customers. To keep up with the changing world and our customers' needs, we hope to do our best to support the website's growth. I'd be glad if any part of this article is helpful to those planning to run their own usability tests.