## Description

I tested several different ways of reading a file from R2 inside a container:
| Method | Initial Speed | Failure Point | Error Type |
|---|---|---|---|
| S3 SDK | ~0.1 MB/s | Never completes | Exit code -1 |
| Presigned URL | 13.46 MB/s | ~14MB | Socket closed |
| Worker Stream | 22-24 MB/s | ~10MB | ECONNRESET |
| FUSE (tigrisfs) | N/A | ~5-10MB | Context deadline exceeded |
Container network egress to R2 appears to be unreliable. All four methods, including the officially documented FUSE approach, fail with the same pattern: initial success, then connection death after ~10-15MB. I am not sure what I am doing wrong.
## Workaround Attempts

We also tried:

- Increasing timeouts (2 min → 5 min): still fails
- Adding heartbeats: the connection still dies
- Streaming with progress logging: confirmed that data stops flowing mid-transfer
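For reference, the "progress logging" above boils down to a stall watchdog. A minimal sketch of the idea (the names, the threshold, and the injectable clock are ours, not from any SDK):

```javascript
// Minimal stall watchdog: note the time of the last received chunk and
// report a stall once no data has arrived within `stallMs`.
// Names and the injectable clock are illustrative, not from any SDK.
function makeStallWatchdog(stallMs, now = Date.now) {
  let lastProgress = now();
  return {
    tick() { lastProgress = now(); },                     // call on every chunk
    stalled() { return now() - lastProgress > stallMs; }, // poll on an interval
  };
}
```

Wired into the download loops below, `tick()` runs per chunk and an interval polls `stalled()`; every method tripped it after ~10-15MB.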
More details:

## Environment

- Container Instance Type: standard-4
- enableInternet: true
- File Size Being Transferred: ~30-100MB (audio files for transcription)
- Use Case: audio transcription pipeline using FFmpeg + the Whisper API
## Failure Pattern (Consistent Across All Methods)

1. Transfer starts successfully at good speed (13-24 MB/s)
2. After ~10-15MB transferred, the connection stalls
3. After 2-3 minutes of stalling, the connection dies
4. The error indicates a timeout or connection reset
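Since each connection reliably delivers roughly the first 10MB, one workaround we have not fully explored is splitting the download into HTTP Range requests, each on a fresh connection. A sketch of the range math (the 8MB default is a guess sized to stay under the stall point, not a recommendation):

```javascript
// Split a `totalSize`-byte object into Range header values of at most
// `chunkSize` bytes, so each chunk can be fetched on its own connection
// and retried independently. The chunk size is illustrative.
function rangeHeaders(totalSize, chunkSize = 8 * 1024 * 1024) {
  const ranges = [];
  for (let start = 0; start < totalSize; start += chunkSize) {
    const end = Math.min(start + chunkSize, totalSize) - 1;
    ranges.push(`bytes=${start}-${end}`);
  }
  return ranges;
}
```

R2 serves ranged GETs both via presigned URLs (a `Range` request header) and via the S3 API (the `Range` parameter on `GetObjectCommand`), so this would slot into Methods 1 and 2 below.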
## Method 1: Container fetches from R2 using the AWS S3 SDK

Approach: the container uses @aws-sdk/client-s3 to fetch directly from R2.

### Container Code (clients.js)
```js
const { S3Client } = require('@aws-sdk/client-s3');

const s3 = new S3Client({
  endpoint: process.env.R2_ENDPOINT,
  credentials: {
    accessKeyId: process.env.AWS_ACCESS_KEY_ID,
    secretAccessKey: process.env.AWS_SECRET_ACCESS_KEY,
  },
  region: 'auto',
});
```
### Container Code (transcribe.js)
```js
const { GetObjectCommand } = require('@aws-sdk/client-s3');

const { Body } = await s3.send(new GetObjectCommand({
  Bucket: BUCKET,
  Key: r2_key,
}));

const chunks = [];
let downloadedBytes = 0;
for await (const chunk of Body) {
  chunks.push(chunk);
  downloadedBytes += chunk.length;
}
const audioBuffer = Buffer.concat(chunks);
```
### Result

- Speed: ~0.1 MB/s (extremely slow)
- Error: container exit code -1 (platform kill)
- Log: download stalls, container terminated by the platform
## Method 2: Container fetches via Presigned URL

Approach: the Worker generates a presigned URL, and the container downloads it with fetch().

### Worker Code (audio.ts)
```ts
import { getSignedUrl } from '@aws-sdk/s3-request-presigner';
import { S3Client, GetObjectCommand } from '@aws-sdk/client-s3';

const s3Client = new S3Client({
  endpoint: `https://${env.CLOUDFLARE_ACCOUNT_ID}.r2.cloudflarestorage.com`,
  credentials: {
    accessKeyId: env.R2_ACCESS_KEY_ID,
    secretAccessKey: env.R2_SECRET_ACCESS_KEY,
  },
  region: 'auto',
});

const presignedUrl = await getSignedUrl(
  s3Client,
  new GetObjectCommand({
    Bucket: 'llm-mindmap-audio',
    Key: r2_key,
  }),
  { expiresIn: 3600 }
);

// Pass the URL to the container
const response = await container.fetch('http://container/transcribe', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({ r2_key, presigned_url: presignedUrl }),
});
```
### Container Code (transcribe.js)
```js
const controller = new AbortController();
const timeout = setTimeout(() => controller.abort(), 300000); // 5 min overall cap

const response = await fetch(presigned_url, { signal: controller.signal });
const reader = response.body.getReader();

const chunks = [];
let downloadedBytes = 0;
while (true) {
  const { done, value } = await reader.read();
  if (done) break;
  chunks.push(value);
  downloadedBytes += value.length;
}
clearTimeout(timeout); // don't let the timer fire after a successful read
```
### Result

- Initial Speed: 13.46 MB/s (good!)
- Failure Point: ~14MB transferred
- Error: SocketError: other side closed
- Log:

```
bytesRead: 14876987
remoteAddress: '172.64.66.1'
code: 'UND_ERR_SOCKET'
[cause]: SocketError: other side closed
```
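A fixed 5-minute AbortController (as in the container code above) cannot distinguish a slow-but-live transfer from a dead one. A sketch of an idle timeout that resets on every chunk instead (the helper name and default are ours):

```javascript
// Read a web-stream reader to completion, but fail as soon as no chunk
// arrives within `idleMs` -- an idle timeout rather than an overall cap.
// Helper name and default are illustrative.
async function readAllWithIdleTimeout(reader, idleMs = 30000) {
  const chunks = [];
  while (true) {
    let timer;
    const idle = new Promise((_, reject) => {
      timer = setTimeout(() => reject(new Error(`no data for ${idleMs}ms`)), idleMs);
    });
    try {
      const { done, value } = await Promise.race([reader.read(), idle]);
      if (done) break;
      chunks.push(value);
    } finally {
      clearTimeout(timer); // cancel this round's timer win or lose
    }
  }
  return Buffer.concat(chunks);
}
```

On a stall this fails after `idleMs` instead of pinning the request open for the full five minutes, which would also let a retry start sooner.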
---
## Method 3: Worker streams R2 body to Container

Approach: the Worker downloads from R2 via the internal binding (fast, no egress), then streams the body to the container.

### Worker Code (audio.ts)
```ts
// Worker downloads from R2 using the internal binding (fast, no egress)
const audioObject = await env.AUDIO.get(r2_key);
if (!audioObject) {
  throw new Error(`Audio file not found in R2: ${r2_key}`);
}

// Stream the R2 body directly to the container
const transcribeResponse = await container.fetch('http://container/transcribe', {
  method: 'POST',
  headers: {
    'Content-Type': 'application/octet-stream',
    'X-R2-Key': r2_key,
    'Content-Length': audioObject.size.toString(),
  },
  body: audioObject.body, // ReadableStream
});
```
### Container Code (transcribe.js)
```js
if (req.headers['content-type'] === 'application/octet-stream') {
  const expectedSize = parseInt(req.headers['content-length'] || '0', 10);
  const chunks = [];
  let receivedBytes = 0;

  await new Promise((resolve, reject) => {
    req.on('data', (chunk) => {
      chunks.push(chunk);
      receivedBytes += chunk.length;
    });
    req.on('end', resolve);
    req.on('error', reject);
  });

  audioBuffer = Buffer.concat(chunks);
}
```
### Result

- Initial Speed: 22-24 MB/s (excellent!)
- Failure Point: ~10MB transferred
- Error: ECONNRESET: aborted
- Log:

```
[transcribe] worker_stream progress { bytes: 10485760, speed_mbps: '23.92' }
... 3 minutes later ...
[transcribe] error at stage 'download_progress': Error: aborted
code: 'ECONNRESET'
```
---
## Method 4: FUSE Mount with tigrisfs

Approach: the container mounts R2 as a filesystem using tigrisfs and reads the file directly.

### Startup Script
```sh
# tigrisfs reads AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY from the environment
tigrisfs --endpoint "${R2_ENDPOINT}" -f "${R2_BUCKET_NAME}" /mnt/r2 &

# Wait for the mount to appear
for i in $(seq 1 30); do
  if mountpoint -q /mnt/r2 2>/dev/null; then
    echo "R2 mounted successfully"
    break
  fi
  sleep 1
done

# Fail fast if the mount never came up
if ! mountpoint -q /mnt/r2 2>/dev/null; then
  echo "R2 mount failed" >&2
  exit 1
fi

exec node server.js
```
### Container Code (transcribe.js)
```js
const R2_MOUNT = '/mnt/r2';
const inputPath = path.join(R2_MOUNT, r2_key);

if (!fs.existsSync(inputPath)) {
  return res.status(404).json({ error: `File not found: ${r2_key}` });
}

const stats = fs.statSync(inputPath); // This works!
console.log(`Size: ${stats.size} bytes`);

// FFmpeg reads from the FUSE mount
const probe = spawnSync('ffprobe', [..., inputPath]); // Times out
```
### Result

- Mount: successful
- File Metadata: readable (size: 30,926,305 bytes)
- Failure Point: reading file content at ~5-10MB
- Error: context deadline exceeded (Client.Timeout or context cancellation while reading body)
- Log:

```
[transcribe] Starting: .../file.mp3, size: 30926305 bytes
Error reading 5242880 +5242880 of .../file.mp3 (attempt 1):
context deadline exceeded (Client.Timeout or context cancellation while reading body)
Error reading 10485760 +5242880 of .../file.mp3 (attempt 1):
context deadline exceeded
```