haveibeenpwned.com is a service that, among other things, hosts a corpus of passwords exposed in data breaches.
On the Pwned Passwords page, you can check whether a password appears in that corpus. The service uses a k-anonymity design, so the full password hash never leaves the client.
A JavaScript application in the browser first calculates the SHA-1 hash of the password and sends only the first five hexadecimal characters of that hash to the service. The service responds with the suffixes of all hashes that start with that prefix. The browser then compares the remaining 35 characters of its own hash against those suffixes locally. Troy Hunt explains the approach in his blog post about Pwned Passwords and k-anonymity.
The same workflow is available through the public API. In Java 21, the standard library already has everything we need: HttpClient, MessageDigest, and HexFormat. The example below also enables the recommended Add-Padding header.
// Assumed definitions, not shown in the original listing.
// Upper-case hex is needed because the API returns upper-case suffixes.
private static final HttpClient HTTP_CLIENT = HttpClient.newHttpClient();
private static final HexFormat HEX_FORMAT = HexFormat.of().withUpperCase();

public static void main(String[] args) throws IOException, InterruptedException {
  String password = "123456";
  String sha1 = sha1Hex(password);
  // Only the five-character prefix is sent; the suffix stays on the client.
  String prefixHash = sha1.substring(0, 5);
  String suffixHash = sha1.substring(5);
  HttpRequest request = HttpRequest
      .newBuilder(URI.create("https://api.pwnedpasswords.com/range/" + prefixHash))
      .header("Add-Padding", "true").header("User-Agent", "pwnd-java-example")
      .build();
  HttpResponse<String> response = HTTP_CLIENT.send(request,
      HttpResponse.BodyHandlers.ofString(StandardCharsets.UTF_8));
  if (response.statusCode() != 200) {
    throw new IOException("Unexpected HTTP status: " + response.statusCode());
  }
  // Each response line has the form <35-character suffix>:<count>.
  for (String line : response.body().split("\\R")) {
    if (line.startsWith(suffixHash + ":")) {
      System.out
          .println("password found, count: " + line.substring(line.indexOf(':') + 1));
      return;
    }
  }
  System.out.println("password not found");
}

private static String sha1Hex(String password) {
  try {
    MessageDigest messageDigest = MessageDigest.getInstance("SHA-1");
    byte[] passwordBytes = messageDigest.digest(password.getBytes(StandardCharsets.UTF_8));
    return HEX_FORMAT.formatHex(passwordBytes);
  }
  catch (NoSuchAlgorithmException e) {
    throw new IllegalStateException("SHA-1 is not available", e);
  }
}
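With Add-Padding enabled, the response also contains padding rows, which the HIBP documentation describes as carrying a count of 0. The following is a minimal sketch of a parsing helper that ignores those rows and compares suffixes case-insensitively. The class and method names (RangeResponse, findCount) are made up for this illustration and are not part of the original example, and the padding row in the sample body is invented.

```java
import java.util.OptionalInt;

// Hypothetical helper, not part of the original example.
final class RangeResponse {

  // Returns the breach count for the given 35-character suffix, or an
  // empty OptionalInt when the suffix is absent or is only a padding row.
  static OptionalInt findCount(String body, String suffixHash) {
    for (String line : body.split("\\R")) {
      int separator = line.indexOf(':');
      if (separator < 0) {
        continue; // skip blank or malformed lines
      }
      if (!line.substring(0, separator).equalsIgnoreCase(suffixHash)) {
        continue;
      }
      int count = Integer.parseInt(line.substring(separator + 1).trim());
      // Assumption: padding rows carry a count of 0, so they count as "not found".
      return count > 0 ? OptionalInt.of(count) : OptionalInt.empty();
    }
    return OptionalInt.empty();
  }

  public static void main(String[] args) {
    // Sample body; the last row stands in for a padding row.
    String body = "0005AD76BD555C1D6D771DE417A4B87E4B4:10\r\n"
        + "000A8DAE4228F821FB418F59826079BF368:4\r\n"
        + "000DD7F2A1C68A35673713783CA390C9E93:0";
    System.out.println(findCount(body, "000a8dae4228f821fb418f59826079bf368"));
    System.out.println(findCount(body, "000DD7F2A1C68A35673713783CA390C9E93"));
  }
}
```

Returning an OptionalInt instead of printing keeps the parsing separate from the HTTP call, which makes it easy to check against a canned response.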
You can find the current API documentation here: https://haveibeenpwned.com/API/v3
If you want to perform checks fully offline, you can mirror the password corpus to your own machine and search it locally. The HIBP documentation points to an official downloader, and the rest of this post shows how to build the same workflow in Java: download the range files, import them into a local database, and query that database.
Download
The offline corpus is exposed through the range API. Every request downloads the hash suffixes for one five-character prefix. HIBP also publishes an official downloader, but building the pipeline yourself is a good way to understand how the data is structured.
To download the whole corpus, the program sends requests for all five-character hexadecimal prefixes, from 00000 up to FFFFF. That is 16^5 = 1,048,576 requests, so the downloader uses an ExecutorService with a fixed-size thread pool.
public static void main(String[] args) throws IOException {
  int numThreads = Math.max(4, Runtime.getRuntime().availableProcessors() * 2);
  int totalRanges = 0x100000; // 16^5 prefixes
  AtomicInteger completedRanges = new AtomicInteger();
  Path outputDir = Path.of("./pwned");
  Files.createDirectories(outputDir);
  // Closing the executor at the end of the try block waits for all submitted downloads.
  try (ExecutorService executor = Executors.newFixedThreadPool(numThreads)) {
    for (int i = 0; i < totalRanges; i++) {
      String range = getRange(i);
      executor.execute(
          () -> downloadRange(range, outputDir, completedRanges, totalRanges));
    }
  }
}
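The listing calls a getRange helper that is not shown. One plausible way to write it, assuming it simply turns the loop index into a five-character upper-case hexadecimal prefix, is sketched below. The class name RangePrefixes and the bounds check are additions for this illustration.

```java
// Hypothetical helper class; only the method name getRange comes from the listing.
final class RangePrefixes {

  static final int TOTAL_RANGES = 0x100000; // 16^5 = 1,048,576

  // Formats a range index as a zero-padded, five-character upper-case hex prefix.
  static String getRange(int index) {
    if (index < 0 || index >= TOTAL_RANGES) {
      throw new IllegalArgumentException("Range index out of bounds: " + index);
    }
    return String.format("%05X", index);
  }

  public static void main(String[] args) {
    System.out.println(getRange(0)); // 00000
    System.out.println(getRange(255)); // 000FF
    System.out.println(getRange(TOTAL_RANGES - 1)); // FFFFF
  }
}
```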
The downloader uses HttpClient, retries transient failures, and stores every response in ./pwned/<hash_prefix>.txt. Padding is intentionally not enabled here because we want only real hash suffixes in the downloaded files.
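private static void downloadRange(String hashPrefix, Path outputDir,
    AtomicInteger completedRanges, int totalRanges) {
  HttpRequest request = HttpRequest.newBuilder(URI.create(RANGE_API + hashPrefix))
      .header("User-Agent", "pwnd-java-downloader")
      .timeout(Duration.ofSeconds(60)).build();
  for (int attempt = 1; attempt <= MAX_RETRIES; attempt++) {
    try {
      HttpResponse<byte[]> response = HTTP_CLIENT.send(request,
          HttpResponse.BodyHandlers.ofByteArray());
      if (response.statusCode() != 200) {
        throw new IOException("Unexpected HTTP status: " + response.statusCode());
      }
      Path outputFile = outputDir.resolve(hashPrefix + ".txt");
      Files.write(outputFile, response.body(), StandardOpenOption.CREATE,
          StandardOpenOption.TRUNCATE_EXISTING, StandardOpenOption.WRITE);
      logProgress(completedRanges.incrementAndGet(), totalRanges);
      return;
    }
    // The listing is cut off after logProgress; everything from the return
    // statement onward is an assumed completion, not the original code.
    catch (IOException e) {
      if (attempt == MAX_RETRIES) {
        throw new UncheckedIOException("Giving up on range " + hashPrefix, e);
      }
    }
    catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("Interrupted while downloading " + hashPrefix, e);
    }
  }
}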
This step takes a while and needs a lot of disk space. Plan for tens of gigabytes for the downloaded files alone.
Import
Each downloaded file contains SHA-1 suffixes. After each suffix comes a colon and the number of times that password has appeared in breach data.
0005AD76BD555C1D6D771DE417A4B87E4B4:10
000A8DAE4228F821FB418F59826079BF368:4
000DD7F2A1C68A35673713783CA390C9E93:873
001E225B908BAC31C56DB04D892E47536E0:6
006BAB7FC3113AA73DE3589630FC08218E7:3
...
The first five characters are missing because they are already encoded in the filename.
Searching these text files directly would be too slow for repeated lookups, so the next step is importing them into a local database.
For this example, I use Xodus, a transactional embedded key/value store from JetBrains.
<dependency>
  <groupId>org.jetbrains.xodus</groupId>
  <artifactId>xodus-environment</artifactId>
  <version>2.0.1</version>
</dependency>
Xodus fits this use case well. The key is the SHA-1 hash, and the value is the breach count.
Xodus organizes data into stores, and stores live inside environments. We only need one environment with one store named passwords, and every operation runs inside a transaction.
The importer first lists and sorts all files in the download directory. The sort order matters because putRight is only valid when keys are inserted in ascending order. The code then opens every file and reads it line by line with Files.newBufferedReader.
private static void importFiles(Environment env, Transaction txn, Path inputDir) {
  Store store = env.openStore("passwords", StoreConfig.WITHOUT_DUPLICATES, txn);
  List<Path> hashFiles = listAllFiles(inputDir);
  long importedEntries = 0L;
  int processedFiles = 0;
  for (Path inputFile : hashFiles) {
    String hashPrefix = fileNameWithoutExtension(inputFile);
    try (BufferedReader reader = Files.newBufferedReader(inputFile,
        StandardCharsets.US_ASCII)) {
      String line;
      while ((line = reader.readLine()) != null) {
        handleLine(store, txn, hashPrefix, line);
        importedEntries++;
        if (importedEntries % FLUSH_INTERVAL == 0) {
          txn.flush();
          System.out.println(
              "Processed " + processedFiles + " of " + hashFiles.size() + " files");
        }
      }
    }
    catch (IOException e) {
      throw new RuntimeException("Unable to read " + inputFile, e);
    }
    processedFiles++;
  }
  txn.commit();
}
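The importer relies on two helpers that are not shown, listAllFiles and fileNameWithoutExtension. A minimal sketch of how they could look with java.nio.file follows, assuming the download directory contains only the <hash_prefix>.txt files written by the downloader. The class name ImportFileListing and the method bodies are guesses for illustration; only the two method names come from the listing.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Stream;

// Hypothetical implementations of the helpers used by importFiles.
final class ImportFileListing {

  // Lists the .txt files in the directory, sorted by file name so that
  // 00000.txt comes first and FFFFF.txt comes last.
  static List<Path> listAllFiles(Path inputDir) {
    try (Stream<Path> files = Files.list(inputDir)) {
      return files.filter(Files::isRegularFile)
          .filter(p -> p.getFileName().toString().endsWith(".txt"))
          .sorted(Comparator.comparing((Path p) -> p.getFileName().toString()))
          .toList();
    }
    catch (IOException e) {
      throw new UncheckedIOException("Unable to list " + inputDir, e);
    }
  }

  // Strips the extension, turning "00ABC.txt" into the hash prefix "00ABC".
  static String fileNameWithoutExtension(Path file) {
    String name = file.getFileName().toString();
    int dot = name.lastIndexOf('.');
    return dot < 0 ? name : name.substring(0, dot);
  }

  public static void main(String[] args) {
    System.out.println(fileNameWithoutExtension(Path.of("pwned", "00ABC.txt"))); // 00ABC
  }
}
```

Sorting by the file name string works here because the prefixes are fixed-width upper-case hex, where the digits 0-9 sort before the letters A-F.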
For each line, the application calls the handleLine method.
static void handleLine(Store store, Transaction txn, String prefix, String line) {
  int separator = line.indexOf(':');
  String sha1 = prefix + line.substring(0, separator);
  int count = Integer.parseInt(line.substring(separator + 1).trim());
  store.putRight(txn, new ArrayByteIterable(HEX_FORMAT.parseHex(sha1)),
      IntegerBinding.intToCompressedEntry(count));
}
Everything stored in Xodus must be wrapped in a ByteIterable.
The SHA-1 hash is a 40-character hexadecimal string, but storing it as raw bytes saves space: 20 bytes instead of 40 characters. Because the file only contains the 35-character suffix, the importer rebuilds the full hash by prepending the filename prefix before converting the value with HexFormat.
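The byte conversion can be tried out without Xodus. The sketch below rebuilds the full hash from a prefix and a suffix, parses it into 20 raw bytes with HexFormat, and checks that the result equals the SHA-1 digest of the plaintext, which is the key the search step looks up later. The class name HashKeys and the method toKey are hypothetical; the sample values are the SHA-1 hash of "123456" split after the fifth character.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;
import java.util.HexFormat;

// Hypothetical helper class for illustration only.
final class HashKeys {

  private static final HexFormat HEX_FORMAT = HexFormat.of().withUpperCase();

  // Joins the file-name prefix and the line suffix and returns the raw 20-byte hash.
  static byte[] toKey(String prefix, String suffix) {
    String sha1 = prefix + suffix;
    if (sha1.length() != 40) {
      throw new IllegalArgumentException("Expected 40 hex characters, got " + sha1.length());
    }
    return HEX_FORMAT.parseHex(sha1);
  }

  // Same hashing as in the search step: SHA-1 over the UTF-8 bytes of the password.
  static byte[] sha1(String password) {
    try {
      return MessageDigest.getInstance("SHA-1")
          .digest(password.getBytes(StandardCharsets.UTF_8));
    }
    catch (NoSuchAlgorithmException e) {
      throw new IllegalStateException("SHA-1 is not available", e);
    }
  }

  public static void main(String[] args) {
    byte[] key = toKey("7C4A8", "D09CA3762AF61E59520943DC26494F8941B");
    System.out.println(key.length); // 20
    System.out.println(Arrays.equals(key, sha1("123456"))); // true
  }
}
```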
Because the hashes are processed in sorted order, the importer can use store.putRight. That is faster than a regular insert because Xodus does not need to search for the insertion position first.
Make sure you have enough free disk space before you start the import. The resulting database is also large.
Search
The final application checks whether a plaintext password exists in the local database.
First, we need an SHA-1 encoder for the plaintext password. Java already provides that through MessageDigest.
private static byte[] sha1(String password) {
  try {
    MessageDigest messageDigest = MessageDigest.getInstance("SHA-1");
    return messageDigest.digest(password.getBytes(StandardCharsets.UTF_8));
  }
  catch (NoSuchAlgorithmException e) {
    throw new IllegalStateException("SHA-1 is not available", e);
  }
}
The lookup method takes the Xodus Environment and the plaintext password. Because this code only reads from the database, it opens a read-only transaction. It hashes the password, queries the passwords store with store.get(), and returns the count when the key exists.
private static Integer haveIBeenPwned(Environment env, String password) {
  byte[] passwordBytes = sha1(password);
  return env.computeInReadonlyTransaction(txn -> {
    Store store = env.openStore("passwords", StoreConfig.WITHOUT_DUPLICATES, txn);
    ByteIterable key = new ArrayByteIterable(passwordBytes);
    ByteIterable bi = store.get(txn, key);
    if (bi != null) {
      return IntegerBinding.compressedEntryToInt(bi);
    }
    return null;
  });
}
In main, the program opens the Xodus environment, checks a few sample passwords, and prints the result together with the lookup time.
public static void main(String[] args) {
  try (Environment env = Environments.newInstance("./pwned_db")) {
    for (String pw : List.of("123456", "password", "654321", "qwerty",
        "letmein")) {
      long start = System.nanoTime();
      Integer count = haveIBeenPwned(env, pw);
      if (count != null) {
        System.out.println("I have been pwned. Number of occurrences: " + count);
      }
      else {
        System.out.println("Password not found");
      }
      System.out.println(Duration.ofNanos(System.nanoTime() - start).toMillis() + " ms");
      System.out.println();
    }
  }
}
Once the data is imported, lookups are fast enough for interactive password checks on local hardware.
This gives you a complete offline workflow for the Pwned Passwords corpus in Java: download, import, and search.
You can find the complete source code on GitHub.