Lanzr

aukit.lua

Oct 28th, 2023
--- AUKit: Audio decoding and processing framework for ComputerCraft
--
--- AUKit is a framework designed to simplify the process of loading, modifying,
--- and playing audio files in various formats. It includes support for loading
--- audio from many sources, including PCM, DFPWM, G.711, and ADPCM codecs, as
--- well as WAV, AIFF, AU, and FLAC files. It can also generate audio on-the-fly
--- as tones, noise, or silence.
--
--- AUKit uses a structure called Audio to store information about each audio
--- chunk. An audio object holds the sample rate of the audio, as well as the
--- data for each channel stored as floating-point numbers. Audio objects can
--- hold any number of channels at any sample rate with any duration.
--
--- To obtain an audio object, you can use any of the main functions in the aukit
--- module. These allow loading from various raw codecs or file formats, with
--- data sources as strings, or tables if using a raw codec loader.
--
--- Once the audio is loaded, various basic operations are available. A subset of
--- the string library is available to simplify operations on the audio, and a
--- number of operators (+, *, .., #) are overridden as well. There are also
--- built-in functions for resampling the audio, with nearest-neighbor, linear,
--- cubic, and sinc interpolation available, as well as for mixing channels
--- (including down to mono) and combining/splitting channels. Finally, audio
--- objects can be exported back to PCM, DFPWM, or WAV data, allowing changes to
--- be easily stored on disk. The stream function also automatically chunks data
--- for use with a speaker. All of these functions return a new audio object,
--- leaving the original intact.
--
--- There are also a number of effects available for audio. These are contained
--- in the aukit.effects table, and modify the audio passed to them (as well as
--- returning the audio so that calls can be chained). The effects are intended
--- to speed up common operations on audio. More effects may be added in future
--- versions.
--
--- For simple audio playback tasks, the aukit.stream table provides a number of
--- functions that can quickly decode audio for real-time playback. Each function
--- returns an iterator function that can be called multiple times to obtain fully
--- decoded chunks of audio in 8-bit PCM, ready for playback to one or more
--- speakers. The functions decode the data, resample it to 48 kHz (using the
--- default resampling method), apply a low-pass filter to decrease interpolation
--- error, mix to mono if desired, and then return a list of tables with samples
--- in the range [-128, 127], plus the current position of the audio. The
--- iterators can be passed directly to the aukit.play function, which complements
--- the aukit.stream suite by playing the decoded audio on speakers while decoding
--- it in real-time, handling synchronization of speakers as best as possible.
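--
--- As a minimal sketch of that pairing (mirroring what calling `aukit` directly
--- does), assuming `song.wav` is a hypothetical file and at least one speaker is
--- attached:
--
---     local file = assert(fs.open("song.wav", "rb"))
---     -- Detect the format from the first 64 bytes, falling back to DFPWM
---     local kind = aukit.detect(file.read(64)) or "dfpwm"
---     file.seek("set", 0)
---     -- Feed the stream decoder up to 48000 bytes per call and play the result
---     aukit.play(aukit.stream[kind](function() return file.read(48000) end), peripheral.find("speaker"))
---     file.close()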
--
--- If you're really lazy, you can also call `aukit` as a function, which takes
--- the path to a file, and plays this on all available speakers.
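--
--- A minimal sketch of that shortcut, assuming this file is saved as `aukit.lua`
--- where `require` can find it, that the module returns the `aukit` table, and
--- that `song.wav` is a hypothetical file:
--
---     local aukit = require "aukit"
---     aukit("song.wav")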
--
--- Be aware that processing large amounts of audio (especially loading FLAC or
--- resampling with higher quality) is *very* slow. It's recommended to use audio
--- files with lower data size (8-bit mono PCM/WAV/AIFF is ideal), and potentially
--- a lower sample rate, to reduce the load on the system - especially as all
--- data gets converted to 8-bit PCM on playback anyway. The code yields
--- internally when things take a long time to avoid abort timeouts.
--
--- For an example of how to use AUKit, see the accompanying auplay.lua file.
--
---@author JackMacWindows
---@license MIT
--
--- <style>#content {width: unset !important;}</style>
--
---@module aukit
---@set project=AUKit

--- MIT License
--
--- Copyright (c) 2021-2023 JackMacWindows
--
--- Permission is hereby granted, free of charge, to any person obtaining a copy
--- of this software and associated documentation files (the "Software"), to deal
--- in the Software without restriction, including without limitation the rights
--- to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
--- copies of the Software, and to permit persons to whom the Software is
--- furnished to do so, subject to the following conditions:
--
--- The above copyright notice and this permission notice shall be included in all
--- copies or substantial portions of the Software.
--
--- THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
--- IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
--- FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
--- AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
--- LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
--- OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
--- SOFTWARE.

local expect = require "cc.expect"
local dfpwm = require "cc.audio.dfpwm"

local bit32_band, bit32_bxor, bit32_lshift, bit32_rshift, bit32_btest, bit32_extract = bit32.band, bit32.bxor, bit32.lshift, bit32.rshift, bit32.btest, bit32.extract
local math_floor, math_ceil, math_sin, math_abs, math_fmod, math_min, math_max, math_pi = math.floor, math.ceil, math.sin, math.abs, math.fmod, math.min, math.max, math.pi
local os_epoch, os_queueEvent, os_pullEvent = os.epoch, os.queueEvent, os.pullEvent
local str_pack, str_unpack, str_sub, str_byte, str_rep = string.pack, string.unpack, string.sub, string.byte, string.rep
local table_pack, table_unpack, table_insert, table_remove = table.pack, table.unpack, table.insert, table.remove

local aukit = setmetatable({}, {__call = function(aukit, path)
    expect(1, path, "string")
    local file = assert(fs.open(path, "rb"))
    local type = aukit.detect(file.read(64)) or "dfpwm"
    file.seek("set", 0)
    aukit.play(aukit.stream[type](function() return file.read(48000) end), peripheral.find("speaker"))
    file.close()
end})
aukit.effects, aukit.stream = {}, {}

---@tfield string _VERSION The version of AUKit that is loaded. This follows [SemVer](https://semver.org) format.
aukit._VERSION = "1.7.0"

---@tfield "none"|"linear"|"cubic"|"sinc" defaultInterpolation Default interpolation mode for @{Audio:resample} and other functions that need to resample.
aukit.defaultInterpolation = "linear"

local Audio = {}
local Audio_mt

local dfpwmUUID = "3ac1fa38-811d-4361-a40d-ce53ca607cd1" -- UUID for DFPWM in WAV files

local function uuidBytes(uuid) return uuid:gsub("-", ""):gsub("%x%x", function(c) return string.char(tonumber(c, 16)) end) end

local sincWindowSize = jit and 30 or 10

local wavExtensible = {
    dfpwm = uuidBytes(dfpwmUUID),
    pcm = uuidBytes "01000000-0000-1000-8000-00aa00389b71",
    msadpcm = uuidBytes "02000000-0000-1000-8000-00aa00389b71",
    alaw = uuidBytes "06000000-0000-1000-8000-00aa00389b71",
    ulaw = uuidBytes "07000000-0000-1000-8000-00aa00389b71",
    adpcm = uuidBytes "11000000-0000-1000-8000-00aa00389b71",
    pcm_float = uuidBytes "03000000-0000-1000-8000-00aa00389b71"
}

local wavExtensibleChannels = {
    0x04,
    0x03,
    0x07,
    0x33,
    0x37,
    0x3F,
    0x637,
    0x63F,
    0x50F7,
    0x50FF,
    0x56F7,
    0x56FF
}

local ima_index_table = {
    [0] = -1, -1, -1, -1, 2, 4, 6, 8,
    -1, -1, -1, -1, 2, 4, 6, 8
}

local ima_step_table = {
    [0] = 7, 8, 9, 10, 11, 12, 13, 14, 16, 17,
    19, 21, 23, 25, 28, 31, 34, 37, 41, 45,
    50, 55, 60, 66, 73, 80, 88, 97, 107, 118,
    130, 143, 157, 173, 190, 209, 230, 253, 279, 307,
    337, 371, 408, 449, 494, 544, 598, 658, 724, 796,
    876, 963, 1060, 1166, 1282, 1411, 1552, 1707, 1878, 2066,
    2272, 2499, 2749, 3024, 3327, 3660, 4026, 4428, 4871, 5358,
    5894, 6484, 7132, 7845, 8630, 9493, 10442, 11487, 12635, 13899,
    15289, 16818, 18500, 20350, 22385, 24623, 27086, 29794, 32767
}

local msadpcm_adaption_table = {
    [0] = 230, 230, 230, 230, 307, 409, 512, 614,
    [-8] = 768, [-7] = 614, [-6] = 512, [-5] = 409, [-4] = 307, [-3] = 230, [-2] = 230, [-1] = 230
}

local flacMetadata = {
    tracknumber = "trackNumber",
    ["encoded-by"] = "encodedBy",
    sourcemedia = "sourceMedia",
    labelno = "labelNumber",
    discnumber = "discNumber",
    partnumber = "partNumber",
    productnumber = "productNumber",
    catalognumber = "catalogNumber",
    ["release date"] = "releaseDate",
    ["source medium"] = "sourceMedium",
    ["source artist"] = "sourceArtist",
    ["guest artist"] = "guestArtist",
    ["source work"] = "sourceWork",
    disctotal = "discCount",
    tracktotal = "trackCount",
    parttotal = "partCount",
    tcm = "composer"
}

local wavMetadata = {
    IPRD = "album",
    INAM = "title",
    IART = "artist",
    IWRI = "author",
    IMUS = "composer",
    IPRO = "producer",
    IPRT = "trackNumber",
    ITRK = "trackNumber",
    IFRM = "trackCount",
    PRT1 = "partNumber",
    PRT2 = "partCount",
    TLEN = "length",
    IRTD = "rating",
    ICRD = "date",
    ITCH = "encodedBy",
    ISFT = "encoder",
    ISRF = "media",
    IGNR = "genre",
    ICMT = "comment",
    ICOP = "copyright",
    ILNG = "language"
}

-- Decodes a UTF-8 string into single-byte characters, replacing any code point
-- above 0xFF with "?"; `pos` is passed through unchanged.
local function utf8decode(str, pos)
    local codes = {utf8.codepoint(str, 1, -1)}
    for i, v in ipairs(codes) do if v > 0xFF then codes[i] = 0x3F end end
    return string.char(table_unpack(codes)), pos
end

-- Limits n to the range [min, max].
local function clamp(n, min, max)
    if n < min then return min
    elseif n > max then return max
    else return n end
end

-- Returns var if it is an Audio object; otherwise raises an argument error for
-- argument n.
local function expectAudio(n, var)
    if type(var) == "table" and getmetatable(var) == Audio_mt then return var end
    expect(n, var, "Audio") -- always fails
end

-- Returns a shallow copy of tab.
local function copy(tab)
    local t = {}
    for k, v in pairs(tab) do t[k] = v end
    return t
end

-- Reads an sz-byte integer from str starting at pos (big-endian if be is set,
-- little-endian otherwise; two's complement if signed is set). Returns the
-- value and the position just past it.
local function intunpack(str, pos, sz, signed, be)
    local n = 0
    if be then for i = 0, sz - 1 do n = n * 256 + str_byte(str, pos+i) end
    else for i = 0, sz - 1 do n = n + str_byte(str, pos+i) * 2^(8*i) end end
    if signed and n >= 2^(sz*8-1) then n = n - 2^(sz*8) end
    return n, pos + sz
end

-- Interpolation kernels. Each takes one channel's sample table and a fractional
-- 1-based index x, and returns the estimated sample at that position.
local interpolate = {
    -- Takes the sample at the floored index
    none = function(data, x)
        return data[math_floor(x)]
    end,
    -- Straight line between the two neighboring samples
    linear = function(data, x)
        local ffx = math_floor(x)
        return data[ffx] + ((data[ffx+1] or data[ffx]) - data[ffx]) * (x - ffx)
    end,
    -- Catmull-Rom cubic through the four surrounding samples
    cubic = function(data, x)
        local ffx = math_floor(x)
        local p0, p1, p2, p3, fx = data[ffx-1], data[ffx], data[ffx+1], data[ffx+2], x - ffx
        p0, p2, p3 = p0 or p1, p2 or p1, p3 or p2 or p1
        return (-0.5*p0 + 1.5*p1 - 1.5*p2 + 0.5*p3)*fx^3 + (p0 - 2.5*p1 + 2*p2 - 0.5*p3)*fx^2 + (-0.5*p0 + 0.5*p2)*fx + p1
    end,
    -- Truncated sinc over sincWindowSize samples on each side
    sinc = function(data, x)
        local ffx = math_floor(x)
        local fx = x - ffx
        local sum = 0
        for n = -sincWindowSize, sincWindowSize do
            local idx = ffx+n
            local d = data[idx]
            if d then
                local px = math_pi * (fx - n)
                if px == 0 then sum = sum + d
                else sum = sum + d * math_sin(px) / px end
            end
        end
        return sum
    end
}
local interpolation_start = {none = 1, linear = 1, cubic = 0, sinc = 0}
local interpolation_end = {none = 1, linear = 2, cubic = 3, sinc = 0}

-- Periodic waveform generators. Each returns the sample at position x for the
-- given frequency and amplitude.
local wavegen = {
    sine = function(x, freq, amplitude)
        return math_sin(2 * x * math_pi * freq) * amplitude
    end,
    triangle = function(x, freq, amplitude)
        return 2.0 * math_abs(amplitude * math_fmod(2.0 * x * freq + 1.5, 2.0) - amplitude) - amplitude
    end,
    square = function(x, freq, amplitude, duty)
        if (x * freq) % 1 >= duty then return -amplitude else return amplitude end
    end,
    sawtooth = function(x, freq, amplitude)
        return amplitude * math_fmod(2.0 * x * freq + 1.0, 2.0) - amplitude
    end
}

--[[
.########.##..........###.....######.
.##.......##.........##.##...##....##
.##.......##........##...##..##......
.######...##.......##.....##.##......
.##.......##.......#########.##......
.##.......##.......##.....##.##....##
.##.......########.##.....##..######.
]]

local decodeFLAC do

-- Simple FLAC decoder (Java)
--
-- Copyright (c) 2017 Project Nayuki. (MIT License)
-- https://www.nayuki.io/page/simple-flac-implementation
--
-- Permission is hereby granted, free of charge, to any person obtaining a copy of
-- this software and associated documentation files (the "Software"), to deal in
-- the Software without restriction, including without limitation the rights to
-- use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of
-- the Software, and to permit persons to whom the Software is furnished to do so,
-- subject to the following conditions:
-- - The above copyright notice and this permission notice shall be included in
-- all copies or substantial portions of the Software.
-- - The Software is provided "as is", without warranty of any kind, express or
-- implied, including but not limited to the warranties of merchantability,
-- fitness for a particular purpose and noninfringement. In no event shall the
-- authors or copyright holders be liable for any claim, damages or other
-- liability, whether in an action of contract, tort or otherwise, arising from,
-- out of or in connection with the Software or the use or other dealings in the
-- Software.

local FIXED_PREDICTION_COEFFICIENTS = {
    {},
    {1},
    {2, -1},
    {3, -3, 1},
    {4, -6, 4, -1},
};

local function BitInputStream(data, pos)
    local obj = {}
    local bitBuffer, bitBufferLen = 0, 0
    function obj.alignToByte()
        bitBufferLen = bitBufferLen - bitBufferLen % 8
    end
    function obj.readByte()
        return obj.readUint(8)
    end
    function obj.readUint(n)
        if n == 0 then return 0 end
        while bitBufferLen < n do
            local temp = str_byte(data, pos)
            pos = pos + 1
            if temp == nil then return nil end
            bitBuffer = (bitBuffer * 256 + temp) % 0x100000000000
            bitBufferLen = bitBufferLen + 8
        end
        bitBufferLen = bitBufferLen - n
        local result = math_floor(bitBuffer / 2^bitBufferLen)
        if n < 32 then result = result % 2^n end
        return result
    end
    function obj.readSignedInt(n)
        local v = obj.readUint(n)
        if v >= 2^(n-1) then v = v - 2^n end
        return v
    end
    function obj.readRiceSignedInt(param)
        local val = 0
        while (obj.readUint(1) == 0) do val = val + 1 end
        val = val * 2^param + obj.readUint(param)
        if bit32_btest(val, 1) then return -math_floor(val / 2) - 1
        else return math_floor(val / 2) end
    end
    return obj
end

local function decodeResiduals(inp, warmup, blockSize, result)
    local method = inp.readUint(2);
    if (method >= 2) then error("Reserved residual coding method " .. method) end
    local paramBits = method == 0 and 4 or 5;
    local escapeParam = method == 0 and 0xF or 0x1F;

    local partitionOrder = inp.readUint(4);
    local numPartitions = 2^partitionOrder;
    if (blockSize % numPartitions ~= 0) then
        error("Block size not divisible by number of Rice partitions")
    end
    local partitionSize = math_floor(blockSize / numPartitions);

    for i = 0, numPartitions-1 do
        local start = i * partitionSize + (i == 0 and warmup or 0);
        local endd = (i + 1) * partitionSize;

        local param = inp.readUint(paramBits);
        if (param < escapeParam) then
            for j = start, endd - 1 do
                result[j+1] = inp.readRiceSignedInt(param)
            end
        else
            local numBits = inp.readUint(5);
            for j = start, endd - 1 do
                result[j+1] = inp.readSignedInt(numBits)
            end
        end
    end
end

local function restoreLinearPrediction(result, coefs, shift, blockSize)
    for i = #coefs, blockSize - 1 do
        local sum = 0
        for j = 0, #coefs - 1 do
            sum = sum + result[i - j] * coefs[j + 1]
        end
        result[i + 1] = result[i + 1] + math_floor(sum / 2^shift)
    end
end

local function decodeFixedPredictionSubframe(inp, predOrder, sampleDepth, blockSize, result)
    for i = 1, predOrder do
        result[i] = inp.readSignedInt(sampleDepth);
    end
    decodeResiduals(inp, predOrder, blockSize, result);
    restoreLinearPrediction(result, FIXED_PREDICTION_COEFFICIENTS[predOrder+1], 0, blockSize);
end

local function decodeLinearPredictiveCodingSubframe(inp, lpcOrder, sampleDepth, blockSize, result)
    for i = 1, lpcOrder do
        result[i] = inp.readSignedInt(sampleDepth);
    end
    local precision = inp.readUint(4) + 1;
    local shift = inp.readSignedInt(5);
    local coefs = {};
    for i = 1, lpcOrder do
        coefs[i] = inp.readSignedInt(precision);
    end
    decodeResiduals(inp, lpcOrder, blockSize, result);
    restoreLinearPrediction(result, coefs, shift, blockSize);
end

local function decodeSubframe(inp, sampleDepth, blockSize, result)
    inp.readUint(1);
    local type = inp.readUint(6);
    local shift = inp.readUint(1);
    if (shift == 1) then
        while (inp.readUint(1) == 0) do shift = shift + 1 end
    end
    sampleDepth = sampleDepth - shift

    if (type == 0) then -- Constant coding
        local c = inp.readSignedInt(sampleDepth)
        for i = 1, blockSize do result[i] = c end
    elseif (type == 1) then -- Verbatim coding
        for i = 1, blockSize do
            result[i] = inp.readSignedInt(sampleDepth);
        end
    elseif (8 <= type and type <= 12) then
        decodeFixedPredictionSubframe(inp, type - 8, sampleDepth, blockSize, result)
    elseif (32 <= type and type <= 63) then
        decodeLinearPredictiveCodingSubframe(inp, type - 31, sampleDepth, blockSize, result)
    else
        error("Reserved subframe type")
    end

    for i = 1, blockSize do
        result[i] = result[i] * 2^shift
    end
end

local function decodeSubframes(inp, sampleDepth, chanAsgn, blockSize, result)
    local subframes = {}
    for i = 1, #result do subframes[i] = {} end
    if (0 <= chanAsgn and chanAsgn <= 7) then
        for ch = 1, #result do
            decodeSubframe(inp, sampleDepth, blockSize, subframes[ch])
        end
    elseif (8 <= chanAsgn and chanAsgn <= 10) then
        decodeSubframe(inp, sampleDepth + (chanAsgn == 9 and 1 or 0), blockSize, subframes[1])
        decodeSubframe(inp, sampleDepth + (chanAsgn == 9 and 0 or 1), blockSize, subframes[2])
        if (chanAsgn == 8) then
            for i = 1, blockSize do
                subframes[2][i] = subframes[1][i] - subframes[2][i]
            end
        elseif (chanAsgn == 9) then
            for i = 1, blockSize do
                subframes[1][i] = subframes[1][i] + subframes[2][i]
            end
        elseif (chanAsgn == 10) then
            for i = 1, blockSize do
                local side = subframes[2][i]
                local right = subframes[1][i] - math_floor(side / 2)
                subframes[2][i] = right
                subframes[1][i] = right + side
            end
        end
    else
        error("Reserved channel assignment");
    end
    for ch = 1, #result do
        for i = 1, blockSize do
            local s = subframes[ch][i]
            if s >= 2^(sampleDepth-1) then s = s - 2^sampleDepth end
            result[ch][i] = s / 2^sampleDepth
        end
    end
end

local function decodeFrame(inp, numChannels, sampleDepth, out2, callback)
    local out = {}
    for i = 1, numChannels do out[i] = {} end
    -- Read a ton of header fields, and ignore most of them
    local temp = inp.readByte()
    if temp == nil then
        return false
    end
    local sync = temp * 64 + inp.readUint(6);
    if sync ~= 0x3FFE then error("Sync code expected") end

    inp.readUint(2);
    local blockSizeCode = inp.readUint(4);
    local sampleRateCode = inp.readUint(4);
    local chanAsgn = inp.readUint(4);
    inp.readUint(4);

    temp = inp.readUint(8);
    local t2 = -1
    for i = 7, 0, -1 do if not bit32_btest(temp, 2^i) then break end t2 = t2 + 1 end
    for i = 1, t2 do inp.readUint(8) end

    local blockSize
    if (blockSizeCode == 1) then
        blockSize = 192
    elseif (2 <= blockSizeCode and blockSizeCode <= 5) then
        blockSize = 576 * 2^(blockSizeCode - 2)
    elseif (blockSizeCode == 6) then
        blockSize = inp.readUint(8) + 1
    elseif (blockSizeCode == 7) then
        blockSize = inp.readUint(16) + 1
    elseif (8 <= blockSizeCode and blockSizeCode <= 15) then
        blockSize = 256 * 2^(blockSizeCode - 8)
    else
        error("Reserved block size")
    end

    if (sampleRateCode == 12) then
        inp.readUint(8)
    elseif (sampleRateCode == 13 or sampleRateCode == 14) then
        inp.readUint(16)
    end

    inp.readUint(8)

    decodeSubframes(inp, sampleDepth, chanAsgn, blockSize, out)
    inp.alignToByte()
    inp.readUint(16)

    if callback then callback(out) else
        for c = 1, numChannels do
            local n = #out2[c]
            for i = 1, blockSize do out2[c][n+i] = out[c][i] end
        end
    end

    return true
end

function decodeFLAC(inp, callback)
    local out = {}
    local pos = 1
    -- Handle FLAC header and metadata blocks
    local temp temp, pos = intunpack(inp, pos, 4, false, true)
    if temp ~= 0x664C6143 then error("Invalid magic string") end
    local sampleRate, numChannels, sampleDepth, numSamples
    local last = false
    local meta = {}
    while not last do
        temp, pos = str_byte(inp, pos), pos + 1
        last = bit32_btest(temp, 0x80)
        local type = bit32_band(temp, 0x7F);
        local length length, pos = intunpack(inp, pos, 3, false, true)
        if type == 0 then -- Stream info block
            pos = pos + 10
            sampleRate, pos = intunpack(inp, pos, 2, false, true)
            sampleRate = sampleRate * 16 + bit32_rshift(str_byte(inp, pos), 4)
            numChannels = bit32_band(bit32_rshift(str_byte(inp, pos), 1), 7) + 1;
            sampleDepth = bit32_band(str_byte(inp, pos), 1) * 16 + bit32_rshift(str_byte(inp, pos+1), 4) + 1;
            numSamples, pos = intunpack(inp, pos + 2, 4, false, true)
            numSamples = numSamples + bit32_band(str_byte(inp, pos-5), 15) * 2^32
            pos = pos + 16
        elseif type == 4 then
            local ncomments
            meta.vendor, ncomments, pos = str_unpack("<s4I4", inp, pos)
            for i = 1, ncomments do
                local str
                str, pos = utf8decode(str_unpack("<s4", inp, pos))
                local k, v = str:match "^([^=]+)=(.*)$"
                if k then meta[flacMetadata[k:lower()] or k:lower()] = v end
            end
        else
            pos = pos + length
        end
    end
    if not sampleRate then error("Stream info metadata block absent") end
    if sampleDepth % 8 ~= 0 then error("Sample depth not supported") end

    for i = 1, numChannels do out[i] = {} end

    if callback then callback(sampleRate, numSamples) end

    -- Decode FLAC audio frames and write raw samples
    inp = BitInputStream(inp, pos)
    repeat until not decodeFrame(inp, numChannels, sampleDepth, out, callback)
    if not callback then return {sampleRate = sampleRate, data = out, metadata = meta, info = {bitDepth = sampleDepth, dataType = "signed"}} end
end

end

--[[
....###....##.....##.########..####..#######.
...##.##...##.....##.##.....##..##..##.....##
..##...##..##.....##.##.....##..##..##.....##
.##.....##.##.....##.##.....##..##..##.....##
.#########.##.....##.##.....##..##..##.....##
.##.....##.##.....##.##.....##..##..##.....##
.##.....##..#######..########..####..#######.
]]

--- Audio
---@section Audio

---@alias Metadata {bitDepth: number|nil, dataType: string|nil}

--- The Audio class represents a chunk of audio with variable channels and sample rate.
---@class Audio
---@field data number[][] The samples in each channel.
---@field sampleRate number The sample rate of the audio.
---@field metadata table Stores any metadata read from the file if present.
---@field info Metadata Stores any decoder-specific information, including `bitDepth` and `dataType`.

---@tfield number sampleRate The sample rate of the audio.
Audio.sampleRate = nil

---@tfield table metadata Stores any metadata read from the file if present.
Audio.metadata = nil

---@tfield table info Stores any decoder-specific information, including `bitDepth` and `dataType`.
Audio.info = nil

--- Returns the length of the audio object in seconds.
---@return number _ The audio length
function Audio:len()
    return #self.data[1] / self.sampleRate
end

--- Returns the number of channels in the audio object.
---@return number _ The number of channels
function Audio:channels()
    return #self.data
end

--- Creates a new audio object with the data resampled to a different sample rate.
--- If the target rate is the same, the object is copied without modification.
---@param sampleRate number The new sample rate in Hertz
---@param interpolation? "none"|"linear"|"cubic"|"sinc" The interpolation mode to use (defaults to `aukit.defaultInterpolation`)
---@return Audio _ A new audio object with the resampled data
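---@usage A minimal sketch, assuming `audio` is an Audio object loaded elsewhere
--- (a hypothetical variable): resample to 48 kHz with cubic interpolation
--
---     local resampled = audio:resample(48000, "cubic")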
function Audio:resample(sampleRate, interpolation)
    expect(1, sampleRate, "number")
    interpolation = expect(2, interpolation, "string", "nil") or aukit.defaultInterpolation
    if not interpolate[interpolation] then error("bad argument #2 (invalid interpolation type)", 2) end
    local new = setmetatable({sampleRate = sampleRate, data = {}, metadata = copy(self.metadata), info = copy(self.info)}, Audio_mt)
    local ratio = sampleRate / self.sampleRate
    local newlen = #self.data[1] * ratio
    local interp = interpolate[interpolation]
    local start = os_epoch "utc"
    for y, c in ipairs(self.data) do
        local line = {}
        for i = 1, newlen do
            if os_epoch "utc" - start > 3000 then start = os_epoch "utc" sleep(0) end
            local x = (i - 1) / ratio + 1
            if x % 1 == 0 then line[i] = c[x]
            else line[i] = clamp(interp(c, x), -1, 1) end
        end
        new.data[y] = line
    end
    return new
end

--- Mixes down all channels to a new mono-channel audio object.
---@return Audio _ A new audio object with the audio mixed to mono
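---@usage A minimal sketch, assuming `stereo` is a two-channel Audio object
--- (a hypothetical variable): average both channels into one
--
---     local mono = stereo:mono()
---     print(mono:channels()) -- 1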
function Audio:mono()
    local new = setmetatable({sampleRate = self.sampleRate, data = {{}}, metadata = copy(self.metadata), info = copy(self.info)}, Audio_mt)
    local ndata = new.data[1]
    local cn = #self.data
    local start = os_epoch "utc"
    for i = 1, #self.data[1] do
        if os_epoch "utc" - start > 3000 then start = os_epoch "utc" sleep(0) end
        local s = 0
        for c = 1, cn do s = s + self.data[c][i] end
        ndata[i] = s / cn
    end
    return new
end

--- Concatenates this audio object with another, adding the contents of each
--- new channel to the end of each old channel, resampling the new channels to match
--- this one (if necessary), and inserting silence in any missing channels.
---@param ... Audio The audio objects to concatenate
---@return Audio _ The new concatenated audio object
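---@usage A minimal sketch, assuming `intro` and `verse` are Audio objects
--- (hypothetical variables): place `verse` straight after `intro`
--
---     local song = intro:concat(verse)
---     -- song now holds the samples of intro followed by those of verse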
function Audio:concat(...)
    local audios = {self, ...}
    local l = {#self.data[1]}
    local cn = #self.data
    for i = 2, #audios do
        expectAudio(i-1, audios[i])
        if audios[i].sampleRate ~= self.sampleRate then audios[i] = audios[i]:resample(self.sampleRate) end
        l[i] = #audios[i].data[1]
        cn = math_max(cn, #audios[i].data)
    end
    local obj = setmetatable({sampleRate = self.sampleRate, data = {}, metadata = copy(self.metadata), info = copy(self.info)}, Audio_mt)
    for c = 1, cn do
        local ch = {}
        local pos = 0
        for a = 1, #audios do
            local sch = audios[a].data[c]
            if sch then for i = 1, l[a] do ch[pos+i] = sch[i] end
            else for i = 1, l[a] do ch[pos+i] = 0 end end
            pos = pos + l[a]
        end
        obj.data[c] = ch
    end
    return obj
end

--- Takes a subregion of the audio and returns a new audio object with its contents.
--- This takes the same arguments as @{string.sub}, but positions start at 0.
---@param start? number The start position of the audio in seconds (rounded down to a whole second)
---@param last? number The end position of the audio in seconds, rounded down to a whole second (0 means end of file)
---@return Audio _ The new split audio object
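---@usage A minimal sketch, assuming `audio` is an Audio object at least 20
--- seconds long (a hypothetical variable): keep seconds 10 through 20
--
---     local clip = audio:sub(10, 20)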
function Audio:sub(start, last)
    start = math_floor(expect(1, start, "number", "nil") or 0)
    last = math_floor(expect(2, last, "number", "nil") or 0)
    local len = #self.data[1] / self.sampleRate
    if start < 0 then start = len + start end
    if last <= 0 then last = len + last end
    expect.range(start, 0, len)
    expect.range(last, 0, len)
    start, last = start * self.sampleRate + 1, last * self.sampleRate + 1
    local obj = setmetatable({sampleRate = self.sampleRate, data = {}, metadata = copy(self.metadata), info = copy(self.info)}, Audio_mt)
    for c = 1, #self.data do
        local ch = {}
        local sch = self.data[c]
        for i = start, last do ch[i-start+1] = sch[i] end
        obj.data[c] = ch
    end
    return obj
end

--- Combines the channels of this audio object with another, adding the new
--- channels on the end of the new object, resampling the new channels to match
--- this one (if necessary), and extending any channels that are shorter than the
--- longest channel with zeroes.
---@param ... Audio The audio objects to combine with
---@return Audio _ The new combined audio object
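---@usage A minimal sketch, assuming `left` and `right` are mono Audio objects
--- (hypothetical variables): build a stereo object from them
--
---     local stereo = left:combine(right)
---     print(stereo:channels()) -- 2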
  756. function Audio:combine(...)
  757. local audios = {self, ...}
  758. local len = #self.data[1]
  759. for i = 2, #audios do
  760. expectAudio(i-1, audios[i])
  761. if audios[i].sampleRate ~= self.sampleRate then audios[i] = audios[i]:resample(self.sampleRate) end
  762. len = math_max(len, #audios[i].data[1])
  763. end
  764. local obj = setmetatable({sampleRate = self.sampleRate, data = {}, metadata = copy(self.metadata), info = copy(self.info)}, Audio_mt)
  765. local pos = 0
  766. for a = 1, #audios do
  767. for c = 1, #audios[a].data do
  768. local sch, ch = audios[a].data[c], {}
  769. for i = 1, len do ch[i] = sch[i] or 0 end
  770. obj.data[pos+c] = ch
  771. end
  772. pos = pos + #audios[a].data
  773. end
  774. return obj
  775. end
  776.  
  777. --- Splits this audio object into one or more objects with the specified channels.
  778. --- Passing a channel that doesn't exist will throw an error.
  779. ---@param ... number[] The lists of channels in each new object
  780. ---@return Audio ... The new audio objects created from the channels in each list
  781. ---@usage Split a stereo track into independent mono objects
  782. --
  783. --- local left, right = stereo:split({1}, {2})
  784. function Audio:split(...)
  785. local retval = {}
  786. for n, cl in ipairs{...} do
  787. expect(n, cl, "table")
  788. if #cl == 0 then error("bad argument #" .. n .. " (cannot use empty table)") end
  789. local obj = setmetatable({sampleRate = self.sampleRate, data = {}, metadata = copy(self.metadata), info = copy(self.info)}, Audio_mt)
  790. for cd, cs in ipairs(cl) do
  791. local sch, ch = self.data[expect(cd, cs, "number")], {}
  792. if not sch then error("channel " .. cs .. " (in argument " .. n .. ") out of range", 2) end
  793. for i = 1, #sch do ch[i] = sch[i] end
  794. obj.data[cd] = ch
  795. end
  796. retval[#retval+1] = obj
  797. end
  798. return table_unpack(retval)
  799. end
  800.  
  801. --- Mixes two or more audio objects into a single object, amplifying each sample
  802. --- with a multiplier (before clipping) if desired, and clipping any values
  803. --- outside the audio range ([-1, 1]). Channels that are shorter are padded with
  804. --- zeroes at the end, and non-existent channels are replaced with all zeroes.
  805. --- Any audio objects with a different sample rate are resampled to match this one.
  806. ---@param amplifier number|Audio The multiplier to apply, or the first audio object
  807. ---@param ... Audio The objects to mix with this one
  808. ---@return Audio _ The new mixed audio object
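---@usage Mix two tracks at half volume to leave headroom before clipping (a sketch; `voice` and `music` are assumed to be Audio objects)
--
---     local mixed = voice:mix(0.5, music)
---     local unscaled = voice:mix(music) -- no multiplier given, so 1 is used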
  809. function Audio:mix(amplifier, ...)
  810. local audios = {self, ...}
  811. local len = #self.data[1]
  812. local cn = #self.data
  813. for i = 2, #audios do
  814. expectAudio(i, audios[i])
  815. if audios[i].sampleRate ~= self.sampleRate then audios[i] = audios[i]:resample(self.sampleRate) end
  816. len = math_max(len, #audios[i].data[1])
  817. cn = math_max(cn, #audios[i].data)
  818. end
  819. if type(amplifier) ~= "number" then
  820. expectAudio(1, amplifier)
  821. if amplifier.sampleRate ~= self.sampleRate then amplifier = amplifier:resample(self.sampleRate) end
  822. len = math_max(len, #amplifier.data[1])
  823. cn = math_max(cn, #amplifier.data)
  824. table_insert(audios, 2, amplifier)
  825. amplifier = 1
  826. end
  827. local obj = setmetatable({sampleRate = self.sampleRate, data = {}, metadata = copy(self.metadata), info = copy(self.info)}, Audio_mt)
  828. for c = 1, cn do
  829. local ch = {}
  830. local sch = {}
  831. for a = 1, #audios do sch[a] = audios[a].data[c] end
  832. for i = 1, len do
  833. local s = 0
  834. for a = 1, #audios do if sch[a] then s = s + (sch[a][i] or 0) end end
  835. ch[i] = clamp(s * amplifier, -1, 1)
  836. end
  837. obj.data[c] = ch
  838. end
  839. return obj
  840. end
  841.  
  842. --- Returns a new audio object that repeats this audio a number of times.
  843. ---@param count number The number of times to play the audio
  844. ---@return Audio _ The repeated audio
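---@usage Loop a clip three times (a sketch; `beep` is assumed to be an Audio object)
--
---     local looped = beep:rep(3)
---     local same = beep * 3 -- `*` is bound to rep through the metatable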
  845. function Audio:rep(count)
  846. if type(self) ~= "table" and type(count) == "table" then self, count = count, self end
  847. expect(1, count, "number")
  848. local obj = setmetatable({sampleRate = self.sampleRate, data = {}, metadata = copy(self.metadata), info = copy(self.info)}, Audio_mt)
  849. for c = 1, #self.data do
  850. local sch, ch = self.data[c], {}
  851. for n = 0, count - 1 do
  852. local pos = n * #sch
  853. for i = 1, #sch do ch[pos+i] = sch[i] end
  854. end
  855. obj.data[c] = ch
  856. end
  857. return obj
  858. end
  859.  
  860. --- Returns a reversed version of this audio.
  861. ---@return Audio _ The reversed audio
  862. function Audio:reverse()
  863. local obj = setmetatable({sampleRate = self.sampleRate, data = {}, metadata = copy(self.metadata), info = copy(self.info)}, Audio_mt)
  864. for c = 1, #self.data do
  865. local sch, ch = self.data[c], {}
  866. local len = #sch
  867. for i = 1, len do ch[len-i+1] = sch[i] end
  868. obj.data[c] = ch
  869. end
  870. return obj
  871. end
  872.  
  873. local function encodePCM(info, pos)
  874. local maxValue = 2^(info.bitDepth-1)
  875. local add = info.dataType == "unsigned" and maxValue or 0
  876. local source = info.audio.data
  877. local encode
  878. if info.dataType == "float" then encode = function(d) return d end
  879. else encode = function(d) return d * (d < 0 and maxValue or maxValue-1) + add end end
  880. local data = {}
  881. local nc = #source
  882. local len = #source[1]
  883. if pos > len then return nil end
  884. local start = os_epoch "utc"
  885. if info.interleaved then for n = pos, pos + info.len - 1 do if os_epoch "utc" - start > 3000 then start = os_epoch "utc" sleep(0) end for c = 1, nc do data[(n-1)*nc+c] = encode(source[c][n]) end end
  886. elseif info.multiple then
  887. for c = 1, nc do
  888. data[c] = {}
  889. for n = pos, pos + info.len - 1 do
  890. if os_epoch "utc" - start > 3000 then start = os_epoch "utc" sleep(0) end
  891. local s = source[c][n]
  892. if not s then break end
  893. data[c][n-pos+1] = encode(s)
  894. end
  895. end
  896. return pos + info.len, table_unpack(data)
  897. else for c = 1, nc do for n = pos, pos + info.len - 1 do if os_epoch "utc" - start > 3000 then start = os_epoch "utc" sleep(0) end data[(c-1)*len+n] = encode(source[c][n]) end end end
  898. return data
  899. end
  900.  
  901. --- Converts the audio data to raw PCM samples.
  902. ---@param bitDepth? number The bit depth of the audio (8, 16, 24, 32)
  903. ---@param dataType? "signed"|"unsigned"|"float" The type of each sample
  904. ---@param interleaved? boolean Whether to interleave each channel
  905. ---@return number[]|nil ... The resulting audio data
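---@usage Export interleaved signed 16-bit samples (a sketch; `audio` is assumed to be an Audio object)
--
---     local samples = audio:pcm(16, "signed", true)
---     -- for stereo input, samples[1] is left sample 1 and samples[2] is right sample 1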
  906. function Audio:pcm(bitDepth, dataType, interleaved)
  907. bitDepth = expect(1, bitDepth, "number", "nil") or 8
  908. dataType = expect(2, dataType, "string", "nil") or "signed"
  909. expect(3, interleaved, "boolean", "nil")
  910. if interleaved == nil then interleaved = true end
    if bitDepth ~= 8 and bitDepth ~= 16 and bitDepth ~= 24 and bitDepth ~= 32 then error("bad argument #1 (invalid bit depth)", 2) end
    if dataType ~= "signed" and dataType ~= "unsigned" and dataType ~= "float" then error("bad argument #2 (invalid data type)", 2) end
    if dataType == "float" and bitDepth ~= 32 then error("bad argument #1 (float audio must have 32-bit depth)", 2) end
  914. return encodePCM({audio = self, bitDepth = bitDepth, dataType = dataType, interleaved = interleaved, len = #self.data[1]}, 1)
  915. end
  916.  
  917. --- Returns a function that can be called to encode PCM samples in chunks.
  918. --- This is useful as a for iterator, and can be used with @{aukit.play}.
---@param chunkSize? number The size of each chunk, in samples per channel
---@param bitDepth? number The bit depth of the audio (8, 16, 24, 32)
---@param dataType? "signed"|"unsigned"|"float" The type of each sample
---@return fun():number[][]|nil,number|nil _ An iterator function that returns
--- chunks of each channel's data as arrays of PCM samples in the requested
--- format (signed 8-bit by default) at this audio's sample rate, as well as
--- the current position of the audio in seconds
  925. ---@return number _ The total length of the audio in seconds
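---@usage Walk the audio in one-second chunks (a sketch; `audio` is assumed to be an Audio object)
--
---     for chunk, pos in audio:stream(audio.sampleRate) do
---         print(("%.1f s: %d samples in channel 1"):format(pos, #chunk[1]))
---     end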
  926. function Audio:stream(chunkSize, bitDepth, dataType)
  927. chunkSize = expect(1, chunkSize, "number", "nil") or 131072
  928. bitDepth = expect(2, bitDepth, "number", "nil") or 8
  929. dataType = expect(3, dataType, "string", "nil") or "signed"
  930. if bitDepth ~= 8 and bitDepth ~= 16 and bitDepth ~= 24 and bitDepth ~= 32 then error("bad argument #2 (invalid bit depth)", 2) end
  931. if dataType ~= "signed" and dataType ~= "unsigned" and dataType ~= "float" then error("bad argument #3 (invalid data type)", 2) end
  932. if dataType == "float" and bitDepth ~= 32 then error("bad argument #2 (float audio must have 32-bit depth)", 2) end
  933. local info, pos = {audio = self, bitDepth = bitDepth, dataType = dataType, interleaved = false, multiple = true, len = chunkSize}, 1
  934. return function()
  935. if info == nil then return nil end
  936. local p = pos / self.sampleRate
  937. local v = {encodePCM(info, pos)}
  938. if v[1] == nil then info = nil return nil end
  939. pos = table_remove(v, 1)
  940. return v, p
  941. end, #self.data[1] / self.sampleRate
  942. end
  943.  
--- Converts the audio data to a WAV file.
  945. ---@param bitDepth? number The bit depth of the audio (1 = DFPWM, 8, 16, 24, 32)
  946. ---@return string _ The resulting WAV file data
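---@usage Save the audio as a 16-bit WAV file (a sketch; assumes ComputerCraft's `fs` API, and the path `out.wav` is hypothetical)
--
---     local file = fs.open("out.wav", "wb")
---     file.write(audio:wav(16))
---     file.close()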
  947. function Audio:wav(bitDepth)
  948. -- TODO: Support float data
  949. bitDepth = expect(1, bitDepth, "number", "nil") or 16
  950. if bitDepth == 1 then
  951. local str = self:dfpwm(true)
  952. return str_pack("<c4Ic4c4IHHIIHHHHIc16c4IIc4I",
  953. "RIFF", #str + 72, "WAVE",
  954. "fmt ", 40, 0xFFFE, #self.data, self.sampleRate, self.sampleRate * #self.data / 8, math_ceil(#self.data / 8), 1,
  955. 22, 1, wavExtensibleChannels[#self.data] or 0, wavExtensible.dfpwm,
  956. "fact", 4, #self.data[1],
  957. "data", #str) .. str
    elseif bitDepth ~= 8 and bitDepth ~= 16 and bitDepth ~= 24 and bitDepth ~= 32 then error("bad argument #1 (invalid bit depth)", 2) end
  959. local data = self:pcm(bitDepth, bitDepth == 8 and "unsigned" or "signed", true)
  960. local str = ""
  961. local csize = jit and 7680 or 32768
  962. local format = ((bitDepth == 8 and "I" or "i") .. (bitDepth / 8)):rep(csize)
    -- Pack every full chunk, then the remainder starting one past the last full chunk
    local full = math_floor(#data / csize) * csize
    for i = 1, full, csize do str = str .. format:pack(table_unpack(data, i, i + csize - 1)) end
    str = str .. ((bitDepth == 8 and "I" or "i") .. (bitDepth / 8)):rep(#data - full):pack(table_unpack(data, full + 1))
  965. return str_pack("<c4Ic4c4IHHIIHHc4I", "RIFF", #str + 36, "WAVE", "fmt ", 16, 1, #self.data, self.sampleRate, self.sampleRate * #self.data * bitDepth / 8, #self.data * bitDepth / 8, bitDepth, "data", #str) .. str
  966. end
  967.  
  968. --- Converts the audio data to DFPWM. All channels share the same encoder, and
  969. --- channels are stored sequentially uninterleaved if `interleaved` is false, or
  970. --- in one interleaved string if `interleaved` is true.
  971. ---@param interleaved? boolean Whether to interleave the channels
  972. ---@return string ... The resulting DFPWM data for each channel (only one string
  973. --- if `interleaved` is true)
  974. function Audio:dfpwm(interleaved)
  975. expect(1, interleaved, "boolean", "nil")
  976. if interleaved == nil then interleaved = true end
  977. if interleaved then
  978. return dfpwm.encode(self:pcm(8, "signed", true))
  979. else
  980. local channels = {self:pcm(8, "signed", false)}
  981. ---@type fun(samples:number[]):string
  982. local encode = dfpwm.make_encoder()
  983. for i = 1, #channels do channels[i] = encode(channels[i]) end
  984. ---@diagnostic disable-next-line return-type-mismatch
  985. return table_unpack(channels)
  986. end
  987. end
  988.  
  989. Audio_mt = {__index = Audio, __add = Audio.combine, __mul = Audio.rep, __concat = Audio.concat, __len = Audio.len, __name = "Audio"}
  990.  
  991. function Audio_mt:__tostring()
  992. return "Audio: " .. self.sampleRate .. " Hz, " .. #self.data .. " channels, " .. (#self.data[1] / self.sampleRate) .. " seconds"
  993. end
  994.  
  995. --[[
  996. ....###....##.....##.##....##.####.########
  997. ...##.##...##.....##.##...##...##.....##...
  998. ..##...##..##.....##.##..##....##.....##...
  999. .##.....##.##.....##.#####.....##.....##...
  1000. .#########.##.....##.##..##....##.....##...
  1001. .##.....##.##.....##.##...##...##.....##...
  1002. .##.....##..#######..##....##.####....##...
  1003. ]]
  1004.  
  1005. --- aukit
  1006. ---@section aukit
  1007.  
  1008. --- Creates a new audio object from the specified raw PCM data.
  1009. ---@param data string|table The audio data, either as a raw string, or a table
  1010. --- of values (in the format specified by `bitDepth` and `dataType`)
  1011. ---@param bitDepth? number The bit depth of the audio (8, 16, 24, 32); if `dataType` is "float" then this must be 32
  1012. ---@param dataType? "signed"|"unsigned"|"float" The type of each sample
  1013. ---@param channels? number The number of channels present in the audio
  1014. ---@param sampleRate? number The sample rate of the audio in Hertz
  1015. ---@param interleaved? boolean Whether each channel is interleaved or separate
  1016. ---@param bigEndian? boolean Whether the audio is big-endian or little-endian; ignored if data is a table
  1017. ---@return Audio _ A new audio object containing the specified data
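---@usage Load eight signed 8-bit mono samples from a table (a sketch; the sample values are illustrative only)
--
---     local audio = aukit.pcm({0, 64, 127, 64, 0, -64, -128, -64}, 8, "signed", 1, 48000)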
  1018. function aukit.pcm(data, bitDepth, dataType, channels, sampleRate, interleaved, bigEndian)
  1019. expect(1, data, "string", "table")
  1020. bitDepth = expect(2, bitDepth, "number", "nil") or 8
  1021. dataType = expect(3, dataType, "string", "nil") or "signed"
  1022. channels = expect(4, channels, "number", "nil") or 1
  1023. sampleRate = expect(5, sampleRate, "number", "nil") or 48000
  1024. expect(6, interleaved, "boolean", "nil")
  1025. if interleaved == nil then interleaved = true end
  1026. expect(7, bigEndian, "boolean", "nil")
  1027. if bitDepth ~= 8 and bitDepth ~= 16 and bitDepth ~= 24 and bitDepth ~= 32 then error("bad argument #2 (invalid bit depth)", 2) end
  1028. if dataType ~= "signed" and dataType ~= "unsigned" and dataType ~= "float" then error("bad argument #3 (invalid data type)", 2) end
  1029. if dataType == "float" and bitDepth ~= 32 then error("bad argument #2 (float audio must have 32-bit depth)", 2) end
  1030. expect.range(channels, 1)
  1031. expect.range(sampleRate, 1)
  1032. local byteDepth = bitDepth / 8
  1033. if (#data / (type(data) == "table" and 1 or byteDepth)) % channels ~= 0 then error("bad argument #1 (uneven amount of data per channel)", 2) end
  1034. local len = (#data / (type(data) == "table" and 1 or byteDepth)) / channels
  1035. local csize = jit and 7680 or 32768
  1036. local csizeb = csize * byteDepth
  1037. local bitDir = bigEndian and ">" or "<"
  1038. local sformat = dataType == "float" and "f" or ((dataType == "signed" and "i" or "I") .. byteDepth)
  1039. local format = bitDir .. str_rep(sformat, csize)
  1040. local maxValue = 2^(bitDepth-1)
  1041. local obj = setmetatable({sampleRate = sampleRate, data = {}, metadata = {}, info = {bitDepth = bitDepth, dataType = dataType}}, Audio_mt)
  1042. for i = 1, channels do obj.data[i] = {} end
  1043. local pos, spos = 1, 1
  1044. local tmp = {}
  1045. local read
  1046. if type(data) == "table" then
  1047. if dataType == "signed" then
  1048. function read()
  1049. local s = data[pos]
  1050. pos = pos + 1
  1051. return s / (s < 0 and maxValue or maxValue-1)
  1052. end
  1053. elseif dataType == "unsigned" then
  1054. function read()
  1055. local s = data[pos]
  1056. pos = pos + 1
            -- The unsigned midpoint depends on the bit depth (128 only for 8-bit)
            return (s - maxValue) / (s < maxValue and maxValue or maxValue-1)
  1058. end
  1059. else
  1060. function read()
  1061. local s = data[pos]
  1062. pos = pos + 1
  1063. return s
  1064. end
  1065. end
  1066. elseif dataType == "float" then
  1067. function read()
  1068. if pos > #tmp then
  1069. if spos + csizeb > #data then
  1070. local f = bitDir .. str_rep(sformat, (#data - spos + 1) / byteDepth)
  1071. tmp = {str_unpack(f, data, spos)}
  1072. spos = tmp[#tmp]
  1073. tmp[#tmp] = nil
  1074. else
  1075. tmp = {str_unpack(format, data, spos)}
  1076. spos = tmp[#tmp]
  1077. tmp[#tmp] = nil
  1078. end
  1079. pos = 1
  1080. end
  1081. local s = tmp[pos]
  1082. pos = pos + 1
  1083. return s
  1084. end
  1085. elseif dataType == "signed" then
  1086. function read()
  1087. if pos > #tmp then
  1088. if spos + csizeb > #data then
  1089. local f = bitDir .. str_rep(sformat, (#data - spos + 1) / byteDepth)
  1090. tmp = {str_unpack(f, data, spos)}
  1091. spos = tmp[#tmp]
  1092. tmp[#tmp] = nil
  1093. else
  1094. tmp = {str_unpack(format, data, spos)}
  1095. spos = tmp[#tmp]
  1096. tmp[#tmp] = nil
  1097. end
  1098. pos = 1
  1099. end
  1100. local s = tmp[pos]
  1101. pos = pos + 1
  1102. return s / (s < 0 and maxValue or maxValue-1)
  1103. end
  1104. else -- unsigned
  1105. function read()
  1106. if pos > #tmp then
  1107. if spos + csizeb > #data then
  1108. local f = bitDir .. str_rep(sformat, (#data - spos + 1) / byteDepth)
  1109. tmp = {str_unpack(f, data, spos)}
  1110. spos = tmp[#tmp]
  1111. tmp[#tmp] = nil
  1112. else
  1113. tmp = {str_unpack(format, data, spos)}
  1114. spos = tmp[#tmp]
  1115. tmp[#tmp] = nil
  1116. end
  1117. pos = 1
  1118. end
  1119. local s = tmp[pos]
  1120. pos = pos + 1
            -- The unsigned midpoint depends on the bit depth (128 only for 8-bit)
            return (s - maxValue) / (s < maxValue and maxValue or maxValue-1)
  1122. end
  1123. end
  1124. local start = os_epoch "utc"
  1125. if interleaved and channels > 1 then
  1126. local d = obj.data
  1127. for i = 1, len do
  1128. if os_epoch "utc" - start > 3000 then start = os_epoch "utc" sleep(0) end
  1129. for j = 1, channels do d[j][i] = read() end
  1130. end
  1131. else for j = 1, channels do
  1132. local line = {}
  1133. obj.data[j] = line
  1134. for i = 1, len do
  1135. if os_epoch "utc" - start > 3000 then start = os_epoch "utc" sleep(0) end
  1136. line[i] = read()
  1137. end
  1138. end end
  1139. return obj
  1140. end
  1141.  
  1142. --- Creates a new audio object from IMA ADPCM data.
  1143. ---@param data string|table The audio data, either as a raw string, or a table of nibbles
  1144. ---@param channels? number The number of channels present in the audio
  1145. ---@param sampleRate? number The sample rate of the audio in Hertz
  1146. ---@param topFirst? boolean Whether the top nibble is the first nibble
  1147. --- (true) or last (false); ignored if `data` is a table
  1148. ---@param interleaved? boolean Whether each channel is interleaved or separate
  1149. ---@param predictor? number|table The initial predictor value(s)
  1150. ---@param step_index? number|table The initial step index(es)
  1151. ---@return Audio _ A new audio object containing the decoded data
  1152. function aukit.adpcm(data, channels, sampleRate, topFirst, interleaved, predictor, step_index)
  1153. expect(1, data, "string", "table")
  1154. channels = expect(2, channels, "number", "nil") or 1
  1155. sampleRate = expect(3, sampleRate, "number", "nil") or 48000
  1156. expect(4, topFirst, "boolean", "nil")
  1157. if topFirst == nil then topFirst = true end
  1158. expect(5, interleaved, "boolean", "nil")
  1159. if interleaved == nil then interleaved = true end
  1160. predictor = expect(6, predictor, "number", "table", "nil")
  1161. step_index = expect(7, step_index, "number", "table", "nil")
  1162. expect.range(channels, 1)
  1163. expect.range(sampleRate, 1)
  1164. if predictor == nil then
  1165. predictor = {}
  1166. for i = 1, channels do predictor[i] = 0 end
  1167. elseif type(predictor) == "number" then
  1168. if channels ~= 1 then error("bad argument #6 (table too short)", 2) end
  1169. predictor = {expect.range(predictor, -32768, 32767)}
  1170. else
  1171. if channels > #predictor then error("bad argument #6 (table too short)", 2) end
  1172. for i = 1, channels do expect.range(predictor[i], -32768, 32767) end
  1173. end
  1174. if step_index == nil then
  1175. step_index = {}
  1176. for i = 1, channels do step_index[i] = 0 end
  1177. elseif type(step_index) == "number" then
  1178. if channels ~= 1 then error("bad argument #7 (table too short)", 2) end
  1179. step_index = {expect.range(step_index, 0, 88)}
  1180. else
  1181. if channels > #step_index then error("bad argument #7 (table too short)", 2) end
  1182. for i = 1, channels do expect.range(step_index[i], 0, 88) end
  1183. end
  1184. local pos = 1
  1185. local read, tmp, len
  1186. if type(data) == "string" then
  1187. function read()
  1188. if tmp then
  1189. local v = tmp
  1190. tmp = nil
  1191. return v
  1192. else
  1193. local b = str_byte(data, pos)
  1194. pos = pos + 1
  1195. if topFirst then tmp, b = bit32_band(b, 0x0F), bit32_rshift(b, 4)
  1196. else tmp, b = bit32_rshift(b, 4), bit32_band(b, 0x0F) end
  1197. return b
  1198. end
  1199. end
  1200. len = math_floor(#data * 2 / channels)
  1201. else
  1202. function read()
  1203. local v = data[pos]
  1204. pos = pos + 1
  1205. return v
  1206. end
  1207. len = #data / channels
  1208. end
  1209. local obj = setmetatable({sampleRate = sampleRate, data = {}, metadata = {}, info = {bitDepth = 16, dataType = "signed"}}, Audio_mt)
  1210. local step = {}
  1211. local start = os_epoch "utc"
  1212. if interleaved then
  1213. local d = obj.data
  1214. for j = 1, channels do d[j] = {} end
  1215. for i = 1, len do
  1216. if os_epoch "utc" - start > 3000 then start = os_epoch "utc" sleep(0) end
  1217. for j = 1, channels do
  1218. local nibble = read()
  1219. step[j] = ima_step_table[step_index[j]]
  1220. step_index[j] = clamp(step_index[j] + ima_index_table[nibble], 0, 88)
  1221. local diff = bit32_rshift((nibble % 8) * step[j], 2) + bit32_rshift(step[j], 3)
  1222. if nibble >= 8 then predictor[j] = clamp(predictor[j] - diff, -32768, 32767)
  1223. else predictor[j] = clamp(predictor[j] + diff, -32768, 32767) end
  1224. d[j][i] = predictor[j] / (predictor[j] < 0 and 32768 or 32767)
  1225. end
  1226. end
  1227. else for j = 1, channels do
  1228. local line = {}
  1229. local predictor, step_index, step = predictor[j], step_index[j], nil
  1230. for i = 1, len do
  1231. if os_epoch "utc" - start > 3000 then start = os_epoch "utc" sleep(0) end
  1232. local nibble = read()
  1233. step = ima_step_table[step_index]
  1234. step_index = clamp(step_index + ima_index_table[nibble], 0, 88)
  1235. local diff = bit32_rshift((nibble % 8) * step, 2) + bit32_rshift(step, 3)
  1236. if nibble >= 8 then predictor = clamp(predictor - diff, -32768, 32767)
  1237. else predictor = clamp(predictor + diff, -32768, 32767) end
  1238. line[i] = predictor / (predictor < 0 and 32768 or 32767)
  1239. end
  1240. obj.data[j] = line
  1241. end end
  1242. return obj
  1243. end
  1244.  
  1245. --- Creates a new audio object from Microsoft ADPCM data.
  1246. ---@param data string The audio data as a raw string
  1247. ---@param blockAlign number The number of bytes in each block
  1248. ---@param channels? number The number of channels present in the audio
  1249. ---@param sampleRate? number The sample rate of the audio in Hertz
  1250. ---@param coefficients? table Two lists of coefficients to use
  1251. ---@return Audio _ A new audio object containing the decoded data
  1252. function aukit.msadpcm(data, blockAlign, channels, sampleRate, coefficients)
  1253. expect(1, data, "string")
  1254. expect(2, blockAlign, "number")
  1255. channels = expect(3, channels, "number", "nil") or 1
  1256. sampleRate = expect(4, sampleRate, "number", "nil") or 48000
  1257. expect(5, coefficients, "table", "nil")
  1258. expect.range(sampleRate, 1)
  1259. local coeff1, coeff2
  1260. if coefficients then
  1261. if type(coefficients[1]) ~= "table" then error("bad argument #5 (first entry is not a table)", 2) end
  1262. if type(coefficients[2]) ~= "table" then error("bad argument #5 (second entry is not a table)", 2) end
  1263. if #coefficients[1] ~= #coefficients[2] then error("bad argument #5 (lists are not the same length)", 2) end
  1264. coeff1, coeff2 = {}, {}
  1265. for i, v in ipairs(coefficients[1]) do
  1266. if type(v) ~= "number" then error("bad entry #" .. i .. " in coefficient list 1 (expected number, got " .. type(v) .. ")", 2) end
  1267. coeff1[i-1] = v
  1268. end
  1269. for i, v in ipairs(coefficients[2]) do
  1270. if type(v) ~= "number" then error("bad entry #" .. i .. " in coefficient list 2 (expected number, got " .. type(v) .. ")", 2) end
  1271. coeff2[i-1] = v
  1272. end
  1273. else coeff1, coeff2 = {[0] = 256, 512, 0, 192, 240, 460, 392}, {[0] = 0, -256, 0, 64, 0, -208, -232} end
  1274. local obj = setmetatable({sampleRate = sampleRate, data = {{}, channels == 2 and {} or nil}, metadata = {}, info = {bitDepth = 16, dataType = "signed"}}, Audio_mt)
  1275. local left, right = obj.data[1], obj.data[2]
  1276. local start = os_epoch "utc"
  1277. for n = 1, #data, blockAlign do
  1278. if channels == 2 then
  1279. local predictorIndexL, predictorIndexR, deltaL, deltaR, sample1L, sample1R, sample2L, sample2R = str_unpack("<BBhhhhhh", data, n)
  1280. local c1L, c2L, c1R, c2R = coeff1[predictorIndexL], coeff2[predictorIndexL], coeff1[predictorIndexR], coeff2[predictorIndexR]
  1281. left[#left+1] = sample2L / (sample2L < 0 and 32768 or 32767)
  1282. left[#left+1] = sample1L / (sample1L < 0 and 32768 or 32767)
  1283. right[#right+1] = sample2R / (sample2R < 0 and 32768 or 32767)
  1284. right[#right+1] = sample1R / (sample1R < 0 and 32768 or 32767)
  1285. for i = 14, blockAlign - 1 do
  1286. local b = str_byte(data, n+i)
  1287. local hi, lo = bit32_rshift(b, 4), bit32_band(b, 0x0F)
  1288. if hi >= 8 then hi = hi - 16 end
  1289. if lo >= 8 then lo = lo - 16 end
  1290. local predictor = clamp(math_floor((sample1L * c1L + sample2L * c2L) / 256) + hi * deltaL, -32768, 32767)
  1291. left[#left+1] = predictor / (predictor < 0 and 32768 or 32767)
  1292. sample2L, sample1L = sample1L, predictor
  1293. deltaL = math_max(math_floor(msadpcm_adaption_table[hi] * deltaL / 256), 16)
  1294. predictor = clamp(math_floor((sample1R * c1R + sample2R * c2R) / 256) + lo * deltaR, -32768, 32767)
  1295. right[#right+1] = predictor / (predictor < 0 and 32768 or 32767)
  1296. sample2R, sample1R = sample1R, predictor
  1297. deltaR = math_max(math_floor(msadpcm_adaption_table[lo] * deltaR / 256), 16)
  1298. end
  1299. elseif channels == 1 then
            -- Read this block's header at offset n, as the stereo branch does
            local predictorIndex, delta, sample1, sample2 = str_unpack("<!1Bhhh", data, n)
  1301. local c1, c2 = coeff1[predictorIndex], coeff2[predictorIndex]
  1302. left[#left+1] = sample2 / (sample2 < 0 and 32768 or 32767)
  1303. left[#left+1] = sample1 / (sample1 < 0 and 32768 or 32767)
  1304. for i = 7, blockAlign - 1 do
  1305. local b = str_byte(data, n+i)
  1306. local hi, lo = bit32_rshift(b, 4), bit32_band(b, 0x0F)
  1307. if hi >= 8 then hi = hi - 16 end
  1308. if lo >= 8 then lo = lo - 16 end
  1309. local predictor = clamp(math_floor((sample1 * c1 + sample2 * c2) / 256) + hi * delta, -32768, 32767)
  1310. left[#left+1] = predictor / (predictor < 0 and 32768 or 32767)
  1311. sample2, sample1 = sample1, predictor
  1312. delta = math_max(math_floor(msadpcm_adaption_table[hi] * delta / 256), 16)
  1313. predictor = clamp(math_floor((sample1 * c1 + sample2 * c2) / 256) + lo * delta, -32768, 32767)
  1314. left[#left+1] = predictor / (predictor < 0 and 32768 or 32767)
  1315. sample2, sample1 = sample1, predictor
  1316. delta = math_max(math_floor(msadpcm_adaption_table[lo] * delta / 256), 16)
  1317. end
  1318. else error("Unsupported number of channels: " .. channels) end
  1319. if os_epoch "utc" - start > 3000 then start = os_epoch "utc" sleep(0) end
  1320. end
  1321. return obj
  1322. end
  1323.  
  1324. --- Creates a new audio object from G.711 u-law/A-law data.
  1325. ---@param data string The audio data as a raw string
  1326. ---@param ulaw boolean Whether the audio uses u-law (true) or A-law (false).
  1327. ---@param channels? number The number of channels present in the audio
  1328. ---@param sampleRate? number The sample rate of the audio in Hertz
  1329. ---@return Audio _ A new audio object containing the decoded data
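---@usage Decode mono u-law data at the default 8 kHz rate (a sketch; `data` is assumed to be a string of raw u-law bytes)
--
---     local audio = aukit.g711(data, true)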
  1330. function aukit.g711(data, ulaw, channels, sampleRate)
  1331. expect(1, data, "string")
  1332. expect(2, ulaw, "boolean")
  1333. channels = expect(3, channels, "number", "nil") or 1
  1334. sampleRate = expect(4, sampleRate, "number", "nil") or 8000
  1335. local retval = {}
  1336. local csize = jit and 7680 or 32768
  1337. local xor = ulaw and 0xFF or 0x55
  1338. for i = 1, channels do retval[i] = {} end
  1339. local start = os_epoch "utc"
  1340. for i = 1, #data, csize do
  1341. local bytes = {str_byte(data, i, i + csize - 1)}
  1342. for j = 1, #bytes do
  1343. local b = bit32_bxor(bytes[j], xor)
  1344. local m, e = bit32_band(b, 0x0F), bit32_extract(b, 4, 3)
  1345. if not ulaw and e == 0 then m = m * 4 + 2
  1346. else m = bit32_lshift(m * 2 + 33, e) end
  1347. if ulaw then m = m - 33 end
  1348. retval[(i+j-2) % channels + 1][math_floor((i+j-2) / channels + 1)] = m / (bit32_btest(b, 0x80) == ulaw and -0x2000 or 0x2000)
  1349. end
  1350. if os_epoch "utc" - start > 3000 then start = os_epoch "utc" sleep(0) end
  1351. end
    -- Format details belong in `info`, matching the other loaders
    return setmetatable({sampleRate = sampleRate, data = retval, metadata = {}, info = {bitDepth = ulaw and 14 or 13, dataType = "signed"}}, Audio_mt)
  1353. end
  1354.  
  1355. --- Creates a new audio object from DFPWM1a data. All channels are expected to
  1356. --- share the same decoder, and are stored interleaved in a single stream.
  1357. ---@param data string The audio data as a raw string
  1358. ---@param channels? number The number of channels present in the audio
  1359. ---@param sampleRate? number The sample rate of the audio in Hertz
  1360. ---@return Audio _ A new audio object containing the decoded data
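---@usage Load a mono DFPWM file (a sketch; assumes ComputerCraft's `fs` API, and the path `song.dfpwm` is hypothetical)
--
---     local file = fs.open("song.dfpwm", "rb")
---     local audio = aukit.dfpwm(file.readAll(), 1, 48000)
---     file.close()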
  1361. function aukit.dfpwm(data, channels, sampleRate)
  1362. expect(1, data, "string")
  1363. channels = expect(2, channels, "number", "nil") or 1
  1364. sampleRate = expect(3, sampleRate, "number", "nil") or 48000
  1365. expect.range(channels, 1)
  1366. expect.range(sampleRate, 1)
  1367. local audio = {}
  1368. local decoder = dfpwm.make_decoder()
  1369. local pos = 1
  1370. local last = 0
  1371. local start = os_epoch "utc"
  1372. while pos <= #data do
  1373. if os_epoch "utc" - start > 3000 then start = os_epoch "utc" sleep(0) end
  1374. local temp = decoder(str_sub(data, pos, pos + 6000))
  1375. if temp == nil or #temp == 0 then break end
  1376. for i=1,#temp do
  1377. audio[last+i] = temp[i]
  1378. end
  1379. last = last + #temp
  1380. pos = pos + 6000
  1381. end
  1382. return aukit.pcm(audio, 8, "signed", channels, sampleRate, true, false)
  1383. end
  1384.  
  1385. --- Creates a new audio object from a WAV file. This accepts PCM files up to 32
  1386. --- bits, including float data, as well as DFPWM files [as specified here](https://gist.github.com/MCJack123/90c24b64c8e626c7f130b57e9800962c),
  1387. --- plus IMA and Microsoft ADPCM formats.
  1388. ---@param data string The WAV data to load
  1389. ---@return Audio _ A new audio object with the contents of the WAV file
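---@usage Load a WAV file and print a summary (a sketch; assumes ComputerCraft's `fs` API, and the path `song.wav` is hypothetical)
--
---     local file = fs.open("song.wav", "rb")
---     local audio = aukit.wav(file.readAll())
---     file.close()
---     print(audio) -- formatted by the __tostring metamethod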
  1390. function aukit.wav(data)
  1391. expect(1, data, "string")
  1392. local channels, sampleRate, bitDepth, length, dataType, blockAlign, coefficients
  1393. local temp, pos = str_unpack("c4", data)
  1394. if temp ~= "RIFF" then error("bad argument #1 (not a WAV file)", 2) end
  1395. pos = pos + 4
  1396. temp, pos = str_unpack("c4", data, pos)
  1397. if temp ~= "WAVE" then error("bad argument #1 (not a WAV file)", 2) end
  1398. local meta = {}
  1399. while pos <= #data do
  1400. local size
  1401. temp, size, pos = str_unpack("<c4I", data, pos)
  1402. if temp == "fmt " then
  1403. local chunk = str_sub(data, pos, pos + size - 1)
  1404. pos = pos + size
  1405. local format
  1406. format, channels, sampleRate, blockAlign, bitDepth = str_unpack("<HHIxxxxHH", chunk)
  1407. if format == 1 then
  1408. dataType = bitDepth == 8 and "unsigned" or "signed"
  1409. elseif format == 2 then
  1410. dataType = "msadpcm"
  1411. local numcoeff = str_unpack("<H", chunk, 21)
  1412. if numcoeff > 0 then
  1413. coefficients = {{}, {}}
  1414. for i = 1, numcoeff do
  1415. coefficients[1][i], coefficients[2][i] = str_unpack("<hh", chunk, i * 4 + 19)
  1416. end
  1417. end
  1418. elseif format == 3 then
  1419. dataType = "float"
  1420. elseif format == 6 then
  1421. dataType = "alaw"
  1422. elseif format == 7 then
  1423. dataType = "ulaw"
  1424. elseif format == 0x11 then
  1425. dataType = "adpcm"
  1426. elseif format == 0xFFFE then
  1427. bitDepth = str_unpack("<H", chunk, 19)
  1428. local uuid = str_sub(chunk, 25, 40)
  1429. if uuid == wavExtensible.pcm then dataType = bitDepth == 8 and "unsigned" or "signed"
  1430. elseif uuid == wavExtensible.dfpwm then dataType = "dfpwm"
  1431. elseif uuid == wavExtensible.msadpcm then dataType = "msadpcm"
  1432. elseif uuid == wavExtensible.pcm_float then dataType = "float"
  1433. elseif uuid == wavExtensible.alaw then dataType = "alaw"
  1434. elseif uuid == wavExtensible.ulaw then dataType = "ulaw"
  1435. elseif uuid == wavExtensible.adpcm then dataType = "adpcm"
  1436. else error("unsupported WAV file", 2) end
  1437. else error("unsupported WAV file", 2) end
  1438. elseif temp == "data" then
  1439. local data = str_sub(data, pos, pos + size - 1)
  1440. if #data < size then error("invalid WAV file", 2) end
  1441. local obj
  1442. if dataType == "adpcm" then
  1443. local blocks = {}
  1444. for n = 1, #data, blockAlign do
  1445. if channels == 2 then
  1446. local predictorL, indexL, predictorR, indexR = str_unpack("<hBxhB", data, n)
  1447. local nibbles = {}
  1448. for i = 8, blockAlign - 1, 8 do
  1449. local b = str_byte(data, n+i)
  1450. nibbles[(i-7)*2-1] = bit32_band(b, 0x0F)
  1451. nibbles[(i-6)*2-1] = bit32_rshift(b, 4)
  1452. b = str_byte(data, n+i+1)
  1453. nibbles[(i-5)*2-1] = bit32_band(b, 0x0F)
  1454. nibbles[(i-4)*2-1] = bit32_rshift(b, 4)
  1455. b = str_byte(data, n+i+2)
  1456. nibbles[(i-3)*2-1] = bit32_band(b, 0x0F)
  1457. nibbles[(i-2)*2-1] = bit32_rshift(b, 4)
  1458. b = str_byte(data, n+i+3)
  1459. nibbles[(i-1)*2-1] = bit32_band(b, 0x0F)
  1460. nibbles[i*2-1] = bit32_rshift(b, 4)
  1461. b = str_byte(data, n+i+4)
  1462. nibbles[(i-7)*2] = bit32_band(b, 0x0F)
  1463. nibbles[(i-6)*2] = bit32_rshift(b, 4)
  1464. b = str_byte(data, n+i+5)
  1465. nibbles[(i-5)*2] = bit32_band(b, 0x0F)
  1466. nibbles[(i-4)*2] = bit32_rshift(b, 4)
  1467. b = str_byte(data, n+i+6)
  1468. nibbles[(i-3)*2] = bit32_band(b, 0x0F)
  1469. nibbles[(i-2)*2] = bit32_rshift(b, 4)
  1470. b = str_byte(data, n+i+7)
  1471. nibbles[(i-1)*2] = bit32_band(b, 0x0F)
  1472. nibbles[i*2] = bit32_rshift(b, 4)
  1473. end
  1474. blocks[#blocks+1] = aukit.adpcm(nibbles, channels, sampleRate, false, true, {predictorL, predictorR}, {indexL, indexR})
  1475. else
  1476. local predictor, index = str_unpack("<hB", data, n)
  1477. index = bit32_band(index, 0x0F)
  1478. blocks[#blocks+1] = aukit.adpcm(str_sub(data, n + 4, n + blockAlign - 1), channels, sampleRate, false, false, predictor, index)
  1479. end
  1480. end
  1481. obj = blocks[1]:concat(table_unpack(blocks, 2))
  1482. elseif dataType == "msadpcm" then obj = aukit.msadpcm(data, blockAlign, channels, sampleRate, coefficients)
  1483. elseif dataType == "alaw" or dataType == "ulaw" then obj = aukit.g711(data, dataType == "ulaw", channels, sampleRate)
  1484. elseif dataType == "dfpwm" then obj = aukit.dfpwm(data, channels, sampleRate)
  1485. else obj = aukit.pcm(data, bitDepth, dataType, channels, sampleRate, true, false) end
  1486. obj.metadata = meta
  1487. obj.info = {dataType = dataType, bitDepth = bitDepth}
  1488. return obj
  1489. elseif temp == "fact" then
  1490. -- TODO
  1491. pos = pos + size
  1492. elseif temp == "LIST" then
  1493. local type = str_unpack("c4", data, pos)
  1494. if type == "INFO" then
  1495. local e = pos + size
  1496. pos = pos + 4
  1497. while pos < e do
  1498. local str
  1499. type, str, pos = str_unpack("!2<c4s4Xh", data, pos)
  1500. if wavMetadata[type] then meta[wavMetadata[type]] = tonumber(str) or str end
  1501. end
  1502. else pos = pos + size end
  1503. else pos = pos + size end
  1504. end
  1505. error("invalid WAV file", 2)
  1506. end
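
-- Usage sketch (not part of the original source): loading a WAV file and
-- inspecting what the loader found. The path "music.wav" is a hypothetical
-- example, and the module is assumed to have been loaded as `aukit`.
--
--     local file = assert(fs.open("music.wav", "rb"))
--     local audio = aukit.wav(file.readAll())
--     file.close()
--     -- `info` and `metadata` are the fields assigned just before returning
--     print(audio.info.dataType, audio.info.bitDepth)
--     for key, value in pairs(audio.metadata) do print(key, value) end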

--- Creates a new audio object from an AIFF or AIFC file.
---@param data string The AIFF data to load
---@return Audio _ A new audio object with the contents of the AIFF file
function aukit.aiff(data)
    expect(1, data, "string")
    local channels, sampleRate, bitDepth, length, offset, compression, blockAlign
    local isAIFC = false
    local temp, pos = str_unpack("c4", data)
    if temp ~= "FORM" then error("bad argument #1 (not an AIFF file)", 2) end
    pos = pos + 4
    temp, pos = str_unpack("c4", data, pos)
    if temp == "AIFC" then isAIFC = true
    elseif temp ~= "AIFF" then error("bad argument #1 (not an AIFF file)", 2) end
    local meta = {}
    while pos <= #data do
        local size
        temp, size, pos = str_unpack(">c4I", data, pos)
        if temp == "COMM" then
            local e, m
            channels, length, bitDepth, e, m, pos = str_unpack(">hIhHI7x", data, pos)
            if isAIFC then
                local s
                compression, s, pos = str_unpack(">c4s1", data, pos)
                if #s % 2 == 0 then pos = pos + 1 end
            end
            length = length * channels * math_floor(bitDepth / 8)
            local s = bit32_btest(e, 0x8000)
            e = ((bit32_band(e, 0x7FFF) - 0x3FFE) % 0x800)
            sampleRate = math.ldexp(m * (s and -1 or 1) / 0x100000000000000, e)
        elseif temp == "SSND" then
            offset, blockAlign, pos = str_unpack(">II", data, pos)
            local data = str_sub(data, pos + offset, pos + offset + length - 1)
            if #data < length then error("invalid AIFF file", 2) end
            local obj
            if compression == nil or compression == "NONE" then obj = aukit.pcm(data, bitDepth, "signed", channels, sampleRate, true, true)
            elseif compression == "sowt" then obj = aukit.pcm(data, bitDepth, "signed", channels, sampleRate, true, false)
            elseif compression == "fl32" or compression == "FL32" then obj = aukit.pcm(data, 32, "float", channels, sampleRate, true, true)
            elseif compression == "alaw" or compression == "ulaw" or compression == "ALAW" or compression == "ULAW" then obj = aukit.g711(data, compression == "ulaw" or compression == "ULAW", channels, sampleRate)
            else error("Unsupported compression scheme " .. compression, 2) end
            obj.metadata = meta
            return obj
        elseif temp == "NAME" then
            meta.title = str_sub(data, pos, pos + size - 1)
            pos = pos + size
        elseif temp == "AUTH" then
            meta.artist = str_sub(data, pos, pos + size - 1)
            pos = pos + size
        elseif temp == "(c) " then
            meta.copyright = str_sub(data, pos, pos + size - 1)
            pos = pos + size
        elseif temp == "ANNO" then
            meta.comment = str_sub(data, pos, pos + size - 1)
            pos = pos + size
        else pos = pos + size end
    end
    error("invalid AIFF file", 2)
end
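
-- Worked example (not part of the original source): how the COMM chunk's
-- 80-bit extended-precision sample rate is decoded above. The ten bytes below
-- are the standard encoding of 44100 Hz; the snippet assumes `string.unpack`
-- and `bit32` are available, as they are in recent ComputerCraft versions.
--
--     local raw = "\x40\x0E\xAC\x44\x00\x00\x00\x00\x00\x00"
--     -- Sign/exponent word, then the top 7 bytes of the 8-byte mantissa
--     local e, m = string.unpack(">HI7", raw)
--     -- 0x400E - 0x3FFE = 16: the scale once the mantissa is a fraction in [0.5, 1)
--     e = (bit32.band(e, 0x7FFF) - 0x3FFE) % 0x800
--     -- m / 2^56 = 0xAC44 / 0x10000 = 44100 / 65536; scaling by 2^16 yields 44100
--     local rate = math.ldexp(m / 0x100000000000000, e)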

--- Creates a new audio object from an AU file.
---@param data string The AU data to load
---@return Audio _ A new audio object with the contents of the AU file
function aukit.au(data)
    expect(1, data, "string")
    local magic, offset, size, encoding, sampleRate, channels = str_unpack(">c4IIIII", data)
    if magic ~= ".snd" then error("invalid AU file", 2) end
    -- The header's data offset is zero-based while Lua strings are one-based,
    -- so the audio starts at offset + 1. A size of 0xFFFFFFFF means "unknown",
    -- in which case everything up to the end of the file is used.
    local body = str_sub(data, offset + 1, size ~= 0xFFFFFFFF and offset + size or nil)
    if encoding == 1 then return aukit.g711(body, true, channels, sampleRate)
    elseif encoding == 2 then return aukit.pcm(body, 8, "signed", channels, sampleRate, true, true)
    elseif encoding == 3 then return aukit.pcm(body, 16, "signed", channels, sampleRate, true, true)
    elseif encoding == 4 then return aukit.pcm(body, 24, "signed", channels, sampleRate, true, true)
    elseif encoding == 5 then return aukit.pcm(body, 32, "signed", channels, sampleRate, true, true)
    elseif encoding == 6 then return aukit.pcm(body, 32, "float", channels, sampleRate, true, true)
    elseif encoding == 27 then return aukit.g711(body, false, channels, sampleRate)
    else error("unsupported encoding type " .. encoding, 2) end
end

--- Creates a new audio object from a FLAC file.
---@param data string The FLAC data to load
---@return Audio _ A new audio object with the contents of the FLAC file
function aukit.flac(data)
    expect(1, data, "string")
    return setmetatable(decodeFLAC(data), Audio_mt)
end

--- Creates a new empty audio object with the specified duration.
---@param duration number The length of the audio in seconds
---@param channels? number The number of channels present in the audio
---@param sampleRate? number The sample rate of the audio in Hertz
---@return Audio _ The new empty audio object
function aukit.new(duration, channels, sampleRate)
    expect(1, duration, "number")
    channels = expect(2, channels, "number", "nil") or 1
    sampleRate = expect(3, sampleRate, "number", "nil") or 48000
    expect.range(channels, 1)
    expect.range(sampleRate, 1)
    local obj = setmetatable({sampleRate = sampleRate, data = {}, metadata = {}, info = {}}, Audio_mt)
    for c = 1, channels do
        local l = {}
        for i = 1, duration * sampleRate do l[i] = 0 end
        obj.data[c] = l
    end
    return obj
end

--- Creates a new audio object with a tone of the specified frequency and duration.
---@param frequency number The frequency of the tone in Hertz
---@param duration number The length of the audio in seconds
---@param amplitude? number The amplitude of the audio from 0.0 to 1.0
---@param waveType? "sine"|"triangle"|"sawtooth"|"square" The type of wave to generate
---@param duty? number The duty cycle of the square wave if selected; ignored otherwise
---@param channels? number The number of channels present in the audio
---@param sampleRate? number The sample rate of the audio in Hertz
---@return Audio _ A new audio object with the tone
function aukit.tone(frequency, duration, amplitude, waveType, duty, channels, sampleRate)
    expect(1, frequency, "number")
    expect(2, duration, "number")
    amplitude = expect(3, amplitude, "number", "nil") or 1
    waveType = expect(4, waveType, "string", "nil") or "sine"
    duty = expect(5, duty, "number", "nil") or 0.5
    channels = expect(6, channels, "number", "nil") or 1
    sampleRate = expect(7, sampleRate, "number", "nil") or 48000
    expect.range(amplitude, 0, 1)
    local f = wavegen[waveType]
    if not f then error("bad argument #4 (invalid wave type)", 2) end
    expect.range(duty, 0, 1)
    expect.range(channels, 1)
    expect.range(sampleRate, 1)
    local obj = setmetatable({sampleRate = sampleRate, data = {}, metadata = {}, info = {}}, Audio_mt)
    for c = 1, channels do
        local l = {}
        for i = 1, duration * sampleRate do l[i] = f(i / sampleRate, frequency, amplitude, duty) end
        obj.data[c] = l
    end
    return obj
end
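
-- Usage sketch (not part of the original source): generating a half-second
-- 440 Hz square wave with a 25% duty cycle and playing it. This assumes a
-- speaker is attached, that the module is loaded as `aukit`, and that
-- `Audio:stream` can be called without arguments (its signature is not shown
-- in this part of the file).
--
--     local speaker = peripheral.find("speaker")
--     local beep = aukit.tone(440, 0.5, 0.8, "square", 0.25)
--     aukit.play(beep:stream(), speaker)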

--- Creates a new audio object with white noise for the specified duration.
---@param duration number The length of the audio in seconds
---@param amplitude? number The amplitude of the audio from 0.0 to 1.0
---@param channels? number The number of channels present in the audio
---@param sampleRate? number The sample rate of the audio in Hertz
---@return Audio _ A new audio object with noise
function aukit.noise(duration, amplitude, channels, sampleRate)
    expect(1, duration, "number")
    amplitude = expect(2, amplitude, "number", "nil") or 1
    channels = expect(3, channels, "number", "nil") or 1
    sampleRate = expect(4, sampleRate, "number", "nil") or 48000
    expect.range(amplitude, 0, 1)
    expect.range(channels, 1)
    expect.range(sampleRate, 1)
    local obj = setmetatable({sampleRate = sampleRate, data = {}, metadata = {}, info = {}}, Audio_mt)
    local random = math.random
    for c = 1, channels do
        local l = {}
        for i = 1, duration * sampleRate do l[i] = (random() * 2 - 1) * amplitude end
        obj.data[c] = l
    end
    return obj
end

--- Packs a table with PCM data into a string using the specified data type.
---@param data number[] The PCM data to pack
---@param bitDepth? number The bit depth of the audio (8, 16, 24, 32); if `dataType` is "float" then this must be 32
---@param dataType? "signed"|"unsigned"|"float" The type of each sample
---@param bigEndian? boolean Whether the data should be big-endian or little-endian
---@return string _ The packed PCM data
function aukit.pack(data, bitDepth, dataType, bigEndian)
    expect(1, data, "string", "table")
    bitDepth = expect(2, bitDepth, "number", "nil") or 8
    dataType = expect(3, dataType, "string", "nil") or "signed"
    expect(4, bigEndian, "boolean", "nil")
    if bitDepth ~= 8 and bitDepth ~= 16 and bitDepth ~= 24 and bitDepth ~= 32 then error("bad argument #2 (invalid bit depth)", 2) end
    if dataType ~= "signed" and dataType ~= "unsigned" and dataType ~= "float" then error("bad argument #3 (invalid data type)", 2) end
    if dataType == "float" and bitDepth ~= 32 then error("bad argument #2 (float audio must have 32-bit depth)", 2) end
    local byteDepth = bitDepth / 8
    local format = (bigEndian and ">" or "<") .. (dataType == "float" and "f" or ((dataType == "signed" and "i" or "I") .. byteDepth))
    local formatChunk = str_sub(format, 1, 1) .. str_sub(format, 2):rep(512)
    local retval = ""
    for i = 1, #data, 512 do
        -- A full chunk covers i .. i + 511. Comparing against i + 512 would treat an
        -- exact multiple of 512 as a partial chunk of #data % 512 == 0 samples and
        -- silently drop the final 512 samples.
        if #data < i + 511 then retval = retval .. str_pack(str_rep(format, #data - i + 1), table_unpack(data, i, #data))
        else retval = retval .. str_pack(formatChunk, table_unpack(data, i, i+511)) end
    end
    return retval
end
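
-- Usage sketch (not part of the original source): packing four samples as
-- little-endian signed 16-bit PCM. The values are written as-is, so they are
-- assumed to already be integers in the target range (-32768 to 32767 here).
--
--     local packed = aukit.pack({0, 1000, -1000, 32767}, 16, "signed", false)
--     print(#packed) -- 8: four samples at two bytes each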

---@alias speaker {playAudio: fun(samples: number[], volume?: number)}

--- Plays back stream functions created by one of the @{aukit.stream} functions
--- or @{Audio:stream}.
---@param callback function():number[][] The iterator function that returns each chunk
---@param progress? function(pos:number) A callback to report progress to
--- the caller; if omitted, a speaker may be passed in this position instead
---@param volume? number The volume to play the audio at; if omitted, a speaker
--- may be passed in this position instead
---@param ... speaker The speakers to play on
function aukit.play(callback, progress, volume, ...)
    expect(1, callback, "function")
    expect(2, progress, "function", "table")
    expect(3, volume, "number", "table", "nil")
    local speakers = {...}
    -- Tables in the progress/volume positions are speakers; move them into the list
    if type(volume) == "table" then
        table_insert(speakers, 1, volume)
        volume = nil
    end
    if type(progress) == "table" then
        table_insert(speakers, 1, progress)
        progress = nil
    end
    if #speakers == 0 then error("bad argument #2 (expected speakers, got nil)", 2) end
    local chunks = {}
    local complete = false
    -- Coroutine a produces chunks from the callback; coroutine b plays them
    local a, b = coroutine.create(function()
        for chunk, pos in callback do chunks[#chunks+1] = {chunk, pos} coroutine.yield(speakers) end
        complete = true
    end), coroutine.create(function()
        while not complete or #chunks > 0 do
            while not chunks[1] do if complete then return end coroutine.yield(speakers) end
            local pchunk = table_remove(chunks, 1)
            local fn = {}
            if progress then progress(pchunk[2]) end
            pchunk = pchunk[1]
            local chunklist = {}
            if #pchunk[1] < 96000 then chunklist = {pchunk}
            else
                -- Split oversized chunks into 48000-sample pieces
                for i = 0, #pchunk[1] - 1, 48000 do
                    local chunk = {}
                    chunklist[#chunklist+1] = chunk
                    for j = 1, #pchunk do
                        local s, c = pchunk[j], {}
                        chunk[j] = c
                        for k = 1, 48000 do c[k] = s[k+i] end
                    end
                end
            end
            for _, chunk in ipairs(chunklist) do
                -- Speaker i plays channel i, falling back to channel 1
                for i, v in ipairs(speakers) do fn[i] = function()
                    local name = peripheral.getName(v)
                    if _HOST:find("CraftOS-PC v2.6.4") and config and not config.get("standardsMode") then
                        v.playAudio(chunk[i] or chunk[1], volume)
                        repeat until select(2, os_pullEvent("speaker_audio_empty")) == name
                    else while not v.playAudio(chunk[i] or chunk[1], volume) do
                        repeat until select(2, os_pullEvent("speaker_audio_empty")) == name
                    end end
                end end
                parallel.waitForAll(table_unpack(fn))
            end
        end
    end)
    local ok, af, bf
    local aq, bq = {{}}, {{}}
    repeat
        if #aq > 0 then
            local event = table_remove(aq, 1)
            if af == speakers then
                af = nil
                table_insert(aq, 1, event)
            end
            if af == nil or event[1] == af then
                ok, af = coroutine.resume(a, table_unpack(event, 1, event.n))
                if not ok then error(af, 2) end
            end
        end
        if #bq > 0 then
            local event = table_remove(bq, 1)
            if bf == speakers then
                bf = nil
                table_insert(bq, 1, event)
            end
            if bf == nil or event[1] == bf then
                ok, bf = coroutine.resume(b, table_unpack(event, 1, event.n))
                if not ok then error(bf, 2) end
            end
        end
        if coroutine.status(b) == "suspended" and (#aq == 0 or #bq == 0) then
            if af ~= nil and bf ~= nil then
                local event = table_pack(os_pullEvent())
                aq[#aq+1] = event
                bq[#bq+1] = event
            else
                os_queueEvent("__queue_end")
                while true do
                    local event = table_pack(os_pullEvent())
                    if event[1] == "__queue_end" then break end
                    aq[#aq+1] = event
                    bq[#bq+1] = event
                end
            end
        end
    until coroutine.status(b) == "dead" or complete
    -- Drain any events still queued for the playback coroutine
    while coroutine.status(b) == "suspended" and #bq > 0 do
        local event = table_remove(bq, 1)
        if bf == nil or event[1] == bf then
            ok, bf = coroutine.resume(b, table_unpack(event, 1, event.n))
            if not ok then error(bf, 2) end
        end
    end
    while coroutine.status(b) == "suspended" do
        ok, bf = coroutine.resume(b, os_pullEvent())
        if not ok then error(bf, 2) end
    end
end
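
-- Usage sketch (not part of the original source): stereo playback on two
-- speakers with a progress callback and half volume. The sides "left" and
-- "right" are hypothetical speaker locations, `audio` is assumed to be a
-- previously loaded Audio object, and `Audio:stream` is assumed to be callable
-- without arguments.
--
--     local left = peripheral.wrap("left")
--     local right = peripheral.wrap("right")
--     aukit.play(audio:stream(), function(pos)
--         print(("%.1f s"):format(pos)) -- position of the chunk being played
--     end, 0.5, left, right)
--
-- The progress callback and volume can both be left out, in which case the
-- speakers start at the second argument: aukit.play(audio:stream(), left).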

local datafmts = {
    {"bbbbbbbb", 8, "signed"},
    {"BBBBBBBB", 8, "unsigned"},
    {"hhhhhhhh", 16, "signed"},
    {"iiiiiiii", 32, "signed"},
    {"ffffffff", 32, "float"},
    {"i3i3i3i3i3i3i3i3", 24, "signed"},
    {"IIIIIIII", 32, "unsigned"},
    {"I3I3I3I3I3I3I3I3", 24, "unsigned"},
    {"HHHHHHHH", 16, "unsigned"},
}

--- Detect the type of audio file from the specified data. This uses heuristic
--- detection methods to attempt to find the correct data type for files without
--- headers. It is not recommended to rely on the data type/bit depth reported
--- for PCM files - they are merely a suggestion.
---@param data string The audio file to check
---@return "pcm"|"dfpwm"|"wav"|"aiff"|"au"|"flac"|nil _ The type of audio file detected, or `nil` if none could be found
---@return number|nil _ The bit depth for PCM data, if the type is "pcm" and the bit depth can be detected
---@return "signed"|"unsigned"|"float"|nil _ The data type for PCM data, if the type is "pcm" and the type can be detected
function aukit.detect(data)
    expect(1, data, "string")
    if data:match "^RIFF....WAVE" then return "wav"
    elseif data:match "^FORM....AIF[FC]" then return "aiff"
    elseif data:match "^%.snd" then return "au"
    elseif data:match "^fLaC" then return "flac"
    else
        -- Detect data type otherwise
        -- This expects the start or end of the audio to be (near) silence
        -- ipairs keeps the formats in the priority order listed in datafmts
        for _, bits in ipairs(datafmts) do
            local mid, gap = bits[3] == "unsigned" and 2^(bits[2]-1) or 0, bits[3] == "float" and 0.001 or 8 * 2^(bits[2]-8)
            -- Check the first eight samples
            local nums = {pcall(str_unpack, bits[1], data)}
            nums[#nums] = nil
            if table_remove(nums, 1) then
                local allzero, ok = true, true
                for _, v in ipairs(nums) do
                    if v ~= mid then allzero = false end
                    if v < mid - gap or v > mid + gap then ok = false break end
                end
                ---@diagnostic disable-next-line return-type-mismatch
                if ok and not allzero then return "pcm", table_unpack(bits, 2) end
            end
            -- Check the last eight samples; eight samples of bits[2] bits occupy
            -- bits[2] bytes, so they start at #data - bits[2] + 1
            nums = {pcall(str_unpack, bits[1], data, #data - bits[2] + 1)}
            nums[#nums] = nil
            if table_remove(nums, 1) then
                local allzero, ok = true, true
                for _, v in ipairs(nums) do
                    if v ~= mid then allzero = false end
                    if v < mid - gap or v > mid + gap then ok = false break end
                end
                ---@diagnostic disable-next-line return-type-mismatch
                if ok and not allzero then return "pcm", table_unpack(bits, 2) end
            end
        end
        if data:match(("\x55"):rep(12)) or data:match(("\xAA"):rep(12)) then return "dfpwm" end
    end
    return nil
end
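
-- Usage sketch (not part of the original source): choosing a loader from the
-- detected type. `data` is assumed to hold the full contents of an audio file.
-- For headerless PCM the reported bit depth and data type are only guesses,
-- and the remaining arguments of `aukit.pcm` are assumed to be optional.
--
--     local kind, bitDepth, dataType = aukit.detect(data)
--     local audio
--     if kind == "wav" then audio = aukit.wav(data)
--     elseif kind == "aiff" then audio = aukit.aiff(data)
--     elseif kind == "au" then audio = aukit.au(data)
--     elseif kind == "flac" then audio = aukit.flac(data)
--     elseif kind == "dfpwm" then audio = aukit.dfpwm(data)
--     elseif kind == "pcm" then audio = aukit.pcm(data, bitDepth, dataType)
--     else error("unknown audio format") end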

--[[
..######..########.########..########....###....##.....##
.##....##....##....##.....##.##.........##.##...###...###
.##..........##....##.....##.##........##...##..####.####
..######.....##....########..######...##.....##.##.###.##
.......##....##....##...##...##.......#########.##.....##
.##....##....##....##....##..##.......##.....##.##.....##
..######.....##....##.....##.########.##.....##.##.....##
]]

--- aukit.stream
---@section aukit.stream

--- Returns an iterator to stream raw PCM audio in CC format. Audio will
--- automatically be resampled to 48 kHz, and optionally mixed down to mono. Data
--- *must* be interleaved - this will not work with planar audio.
---@param data string|table|function The audio data, either as a raw string, a
--- table of values (in the format specified by `bitDepth` and `dataType`), or a
--- function that returns either of those types. Functions will be called at
--- least once before returning to get the type of data to use.
---@param bitDepth? number The bit depth of the audio (8, 16, 24, 32); if `dataType` is "float" then this must be 32
---@param dataType? "signed"|"unsigned"|"float" The type of each sample
---@param channels? number The number of channels present in the audio
---@param sampleRate? number The sample rate of the audio in Hertz
---@param bigEndian? boolean Whether the audio is big-endian or little-endian; ignored if data is a table
---@param mono? boolean Whether to mix the audio down to mono
---@return fun():number[][]|nil,number|nil _ An iterator function that returns
--- chunks of each channel's data as arrays of signed 8-bit 48kHz PCM, as well as
--- the current position of the audio in seconds
---@return number _ The total length of the audio in seconds, or the length of
--- the first chunk if using a function
function aukit.stream.pcm(data, bitDepth, dataType, channels, sampleRate, bigEndian, mono)
    local fn, complete
    if type(data) == "function" then fn, data = data, data() end
    expect(1, data, "string", "table")
    bitDepth = expect(2, bitDepth, "number", "nil") or 8
    dataType = expect(3, dataType, "string", "nil") or "signed"
    channels = expect(4, channels, "number", "nil") or 1
    sampleRate = expect(5, sampleRate, "number", "nil") or 48000
    expect(6, bigEndian, "boolean", "nil")
    expect(7, mono, "boolean", "nil")
    if bitDepth ~= 8 and bitDepth ~= 16 and bitDepth ~= 24 and bitDepth ~= 32 then error("bad argument #2 (invalid bit depth)", 2) end
    if dataType ~= "signed" and dataType ~= "unsigned" and dataType ~= "float" then error("bad argument #3 (invalid data type)", 2) end
    if dataType == "float" and bitDepth ~= 32 then error("bad argument #2 (float audio must have 32-bit depth)", 2) end
    expect.range(channels, 1)
    expect.range(sampleRate, 1)
    if channels == 1 then mono = false end
    local byteDepth = bitDepth / 8
    local len = (#data / (type(data) == "table" and 1 or byteDepth)) / channels
    local csize = jit and 7680 or 32768
    local csizeb = csize * byteDepth
    local bitDir = bigEndian and ">" or "<"
    local sformat = dataType == "float" and "f" or ((dataType == "signed" and "i" or "I") .. byteDepth)
    local format = bitDir .. str_rep(sformat, csize)
    local maxValue = 2^(bitDepth-1)
    local pos, spos = 1, 1
    local tmp = {}
    -- read() returns the next sample normalized to [-1, 1], or nil at the end
    local read
    if type(data) == "table" then
        if dataType == "signed" then
            function read()
                if complete then return nil end
                if fn and pos > #data then
                    data, pos = fn(), 1
                    if not data then complete = true return nil end
                end
                local s = data[pos]
                pos = pos + 1
                return s / (s < 0 and maxValue or maxValue-1)
            end
        elseif dataType == "unsigned" then
            function read()
                if complete then return nil end
                if fn and pos > #data then
                    data, pos = fn(), 1
                    if not data then complete = true return nil end
                end
                local s = data[pos]
                pos = pos + 1
                -- The unsigned midpoint is maxValue (128 only at 8 bits)
                return (s - maxValue) / (s < maxValue and maxValue or maxValue-1)
            end
        else
            function read()
                if complete then return nil end
                if fn and pos > #data then
                    data, pos = fn(), 1
                    if not data then complete = true return nil end
                end
                local s = data[pos]
                pos = pos + 1
                return s
            end
        end
    elseif dataType == "float" then
        function read()
            if complete then return nil end
            if pos > #tmp then
                if fn and spos > #data then
                    data, spos = fn(), 1
                    if not data then complete = true return nil end
                end
                if spos + csizeb > #data then
                    local f = bitDir .. str_rep(sformat, (#data - spos + 1) / byteDepth)
                    tmp = {str_unpack(f, data, spos)}
                    spos = tmp[#tmp]
                    tmp[#tmp] = nil
                else
                    tmp = {str_unpack(format, data, spos)}
                    spos = tmp[#tmp]
                    tmp[#tmp] = nil
                end
                pos = 1
            end
            local s = tmp[pos]
            pos = pos + 1
            return s
        end
    elseif dataType == "signed" then
        function read()
            if complete then return nil end
            if pos > #tmp then
                if fn and spos > #data then
                    data, spos = fn(), 1
                    if not data then complete = true return nil end
                end
                if spos + csizeb > #data then
                    local f = bitDir .. str_rep(sformat, (#data - spos + 1) / byteDepth)
                    tmp = {str_unpack(f, data, spos)}
                    spos = tmp[#tmp]
                    tmp[#tmp] = nil
                else
                    tmp = {str_unpack(format, data, spos)}
                    spos = tmp[#tmp]
                    tmp[#tmp] = nil
                end
                pos = 1
            end
            local s = tmp[pos]
            pos = pos + 1
            return s / (s < 0 and maxValue or maxValue-1)
        end
    else -- unsigned
        function read()
            if complete then return nil end
            if pos > #tmp then
                if fn and spos > #data then
                    data, spos = fn(), 1
                    if not data then complete = true return nil end
                end
                if spos + csizeb > #data then
                    local f = bitDir .. str_rep(sformat, (#data - spos + 1) / byteDepth)
                    tmp = {str_unpack(f, data, spos)}
                    spos = tmp[#tmp]
                    tmp[#tmp] = nil
                else
                    tmp = {str_unpack(format, data, spos)}
                    spos = tmp[#tmp]
                    tmp[#tmp] = nil
                end
                pos = 1
            end
            local s = tmp[pos]
            pos = pos + 1
            -- The unsigned midpoint is maxValue (128 only at 8 bits)
            return (s - maxValue) / (s < maxValue and maxValue or maxValue-1)
        end
    end
    local d = {}
    local ratio = 48000 / sampleRate
    local lp_alpha = 1 - math.exp(-(sampleRate / 96000) * 2 * math_pi)
    local interp = interpolate[aukit.defaultInterpolation]
    -- Each channel buffer lazily pulls samples from read() when indexed
    for j = 1, (mono and 1 or channels) do d[j] = setmetatable({}, {__index = function(self, i)
        if mono then for _ = 1, channels do self[i] = (rawget(self, i) or 0) + read() end self[i] = self[i] / channels
        else self[i] = read() end
        return rawget(self, i)
    end}) end
    local n = 0
    local ok = true
    return function()
        if not ok or complete then return nil end
        for i = (n == 0 and interpolation_start[aukit.defaultInterpolation] or 1), interpolation_end[aukit.defaultInterpolation] do
            if mono then
                local s = 0
                for j = 1, channels do
                    local c = read()
                    if not c then return nil end
                    s = s + c
                end
                d[1][i] = s / channels
            else for j = 1, channels do d[j][i] = read() if not d[j][i] then return nil end end end
        end
        local chunk = {}
        for j = 1, #d do chunk[j] = {} end
        -- Running out of input raises inside the loop; pcall keeps the partial chunk
        ok = pcall(function()
            local ls = {}
            for y = 1, #d do
                local s = chunk[y][0] or 0
                ls[y] = s / (s < 0 and 128 or 127)
            end
            for i = 1, 48000 do
                for y = 1, #d do
                    local x = ((i - 1) / ratio) + 1
                    local s
                    if x % 1 == 0 then s = d[y][x]
                    else s = interp(d[y], x) end
                    local ns = ls[y] + lp_alpha * (s - ls[y])
                    chunk[y][i] = clamp(ns * (ns < 0 and 128 or 127), -128, 127)
                    ls[y] = s
                end
            end
        end)
        if #chunk[1] == 0 then return nil end
        n = n + #chunk[1]
        -- Carry the trailing samples over so interpolation stays continuous
        for y = 1, #d do
            if aukit.defaultInterpolation == "sinc" then
                local t, l = {}, #d[y]
                for i = -sincWindowSize, 0 do
                    t[i] = d[y][l + i]
                end
                d[y] = setmetatable(t, getmetatable(d[y]))
            else
                local l2, l1 = d[y][#d[y]-1], d[y][#d[y]]
                d[y] = setmetatable({}, getmetatable(d[y]))
                d[y][-1], d[y][0] = l2, l1
            end
        end
        return chunk, (n - #chunk[1]) / 48000
    end, len / sampleRate
end
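
-- Usage sketch (not part of the original source): streaming a large raw PCM
-- file without loading it all at once by passing a function as the data
-- source. The path "music.pcm" and its format (44.1 kHz, stereo, signed 16-bit
-- little-endian, interleaved) are hypothetical. Each read is a multiple of the
-- 4-byte frame size so that no sample is split across chunks.
--
--     local file = assert(fs.open("music.pcm", "rb"))
--     local iter = aukit.stream.pcm(function()
--         return file.read(48000) -- returns nil at end of file, ending the stream
--     end, 16, "signed", 2, 44100, false, true)
--     aukit.play(iter, peripheral.find("speaker"))
--     file.close()
--
-- With a function source, the second return value only covers the first chunk,
-- so it cannot be used as the total duration.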
  2096.  
  2097. --- Returns an iterator to stream audio from DFPWM data. Audio will automatically
  2098. --- be resampled to 48 kHz. Multiple channels are expected to be interleaved in
  2099. --- the encoded DFPWM data.
  2100. ---@param data string|function():string The DFPWM data to decode, or a function
  2101. --- returning chunks to decode
  2102. ---@param sampleRate? number The sample rate of the audio in Hertz
  2103. ---@param channels? number The number of channels present in the audio
  2104. ---@param mono? boolean Whether to mix the audio down to mono
  2105. ---@return fun():number[][]|nil,number|nil _ An iterator function that returns
  2106. --- chunks of each channel's data as arrays of signed 8-bit 48kHz PCM, as well as
  2107. --- the current position of the audio in seconds
  2108. ---@return number|nil _ The total length of the audio in seconds, or nil if data
  2109. --- is a function
  2110. function aukit.stream.dfpwm(data, sampleRate, channels, mono)
  2111. expect(1, data, "string", "function")
  2112. sampleRate = expect(2, sampleRate, "number", "nil") or 48000
  2113. channels = expect(3, channels, "number", "nil") or 1
  2114. expect.range(sampleRate, 1)
  2115. expect.range(channels, 1)
  2116. if channels == 1 then mono = false end
  2117. local decoder = dfpwm.make_decoder()
  2118. local pos = 1
  2119. local last = 0
  2120. local isstr = type(data) == "string"
  2121. local buf = ""
  2122. return function()
  2123. local d
  2124. if isstr then
  2125. if pos > #data then return nil end
  2126. d = str_sub(data, pos, pos + 6000 * channels - 1)
  2127. else
  2128. while #buf < sampleRate / 8 * channels do
  2129. local chunk = data()
  2130. if not chunk then
  2131. if #buf == 0 then return nil
  2132. else break end
  2133. end
  2134. buf = buf .. chunk
  2135. end
  2136. d = str_sub(buf, 1, 6000 * channels)
  2137. buf = str_sub(buf, 6000 * channels + 1)
  2138. end
  2139. local audio = decoder(d)
  2140. if audio == nil or #audio == 0 then return nil end
  2141. audio[0], last = last, audio[#audio]
  2142. os_queueEvent("nosleep")
  2143. repeat until "nosleep" == os_pullEvent()
  2144. local ratio = 48000 / sampleRate
  2145. local newlen = #audio * ratio
  2146. local interp = interpolate[aukit.defaultInterpolation]
  2147. local lines = {{}}
  2148. if not mono then for j = 1, channels do lines[j] = {} end end
  2149. for i = 1, newlen, channels do
  2150. local n = 0
  2151. for j = 1, channels do
  2152. local x = (i - 1) / ratio + 1
  2153. local s
  2154. if x % 1 == 0 then s = audio[x]
  2155. else s = clamp(interp(audio, x), -128, 127) end
  2156. if mono then n = n + s
  2157. else lines[j][math_ceil(i / channels)] = s end
  2158. end
  2159. if mono then lines[1][math_ceil(i / channels)] = n / channels end
  2160. end
  2161. os_queueEvent("nosleep")
  2162. repeat until "nosleep" == os_pullEvent()
  2163. local p = pos
  2164. pos = pos + 6000 * channels
  2165. return lines, (p - 1) * 8 / sampleRate / channels
  2166. end, isstr and #data * 8 / sampleRate / channels or nil
  2167. end
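--[=[
Sketch (not part of AUKit): one way to consume the iterator returned by a stream
function. The `drain` helper, its `speaker` argument (any object with a
`playAudio(buffer)` method that returns false while its buffer is full) and the
`wait` callback are assumptions made for this example.

```lua
-- Hypothetical helper: feed every chunk from a stream iterator to a
-- speaker-like object, retrying after `wait()` while the speaker is full.
local function drain(iter, speaker, wait)
    local samples = 0
    for chunk in iter do
        -- chunk[1] is the first channel as signed 8-bit samples
        while not speaker.playAudio(chunk[1]) do wait() end
        samples = samples + #chunk[1]
    end
    return samples
end
```

On a ComputerCraft computer this could be called as
`drain(aukit.stream.dfpwm(data, 48000), peripheral.find("speaker"), function() os.pullEvent("speaker_audio_empty") end)`.
]=]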
  2168.  
  2169. --- Returns an iterator to stream audio from Microsoft ADPCM data. Audio will
  2170. --- automatically be resampled to 48 kHz.
  2171. ---@param input string|function():string The audio data as a raw string or
  2172. --- reader function
  2173. ---@param blockAlign number The number of bytes in each block
  2174. ---@param channels? number The number of channels present in the audio
  2175. ---@param sampleRate? number The sample rate of the audio in Hertz
  2176. ---@param mono? boolean Whether to mix the audio down to mono
  2177. ---@param coefficients? table Two lists of coefficients to use
  2178. ---@return fun():number[][]|nil,number|nil _ An iterator function that returns
  2179. --- chunks of each channel's data as arrays of signed 8-bit 48kHz PCM, as well as
  2180. --- the current position of the audio in seconds
  2181. ---@return number|nil _ The total length of the audio in seconds, or nil if input
  2182. --- is a function
  2183. function aukit.stream.msadpcm(input, blockAlign, channels, sampleRate, mono, coefficients)
  2184. expect(1, input, "string", "function")
  2185. expect(2, blockAlign, "number")
  2186. channels = expect(3, channels, "number", "nil") or 1
  2187. sampleRate = expect(4, sampleRate, "number", "nil") or 48000
  2188. expect(5, mono, "boolean", "nil")
  2189. expect(6, coefficients, "table", "nil")
  2190. expect.range(sampleRate, 1)
  2191. local isfunc = type(input) == "function"
  2192. local coeff1, coeff2
  2193. if coefficients then
  2194. if type(coefficients[1]) ~= "table" then error("bad argument #6 (first entry is not a table)", 2) end
  2195. if type(coefficients[2]) ~= "table" then error("bad argument #6 (second entry is not a table)", 2) end
  2196. if #coefficients[1] ~= #coefficients[2] then error("bad argument #6 (lists are not the same length)", 2) end
  2197. coeff1, coeff2 = {}, {}
  2198. for i, v in ipairs(coefficients[1]) do
  2199. if type(v) ~= "number" then error("bad entry #" .. i .. " in coefficient list 1 (expected number, got " .. type(v) .. ")", 2) end
  2200. coeff1[i-1] = v
  2201. end
  2202. for i, v in ipairs(coefficients[2]) do
  2203. if type(v) ~= "number" then error("bad entry #" .. i .. " in coefficient list 2 (expected number, got " .. type(v) .. ")", 2) end
  2204. coeff2[i-1] = v
  2205. end
  2206. else coeff1, coeff2 = {[0] = 256, 512, 0, 192, 240, 460, 392}, {[0] = 0, -256, 0, 64, 0, -208, -232} end
  2207. local ratio = 48000 / sampleRate
  2208. local interp = interpolate[aukit.defaultInterpolation]
  2209. local n, pos = 1, 0
  2210. local data = isfunc and input() or input
  2211. if channels == 2 then
  2212. local samplesPerBlock = blockAlign - 14
  2213. local iterPerSecond = math_ceil(sampleRate / samplesPerBlock)
  2214. local bytesPerSecond = blockAlign * iterPerSecond
  2215. local newlen = math_floor(samplesPerBlock * ratio)
  2216. local lastL, lastR
  2217. return function()
  2218. if data == nil then return nil end
  2219. local target = n + bytesPerSecond
  2220. local retval = {{}, not mono and {} or nil}
  2221. local rp = 0
  2222. local start = os_epoch "utc"
  2223. while n < target do
  2224. if os_epoch "utc" - start > 3000 then
  2225. os_queueEvent("nosleep")
  2226. repeat until "nosleep" == os_pullEvent()
  2227. start = os_epoch "utc"
  2228. end
  2229. if isfunc and n > #data then
  2230. pos = pos + #data
  2231. n = n - #data
  2232. data = input()
  2233. if data == nil then return nil end
  2234. end
  2235. if n > #data then break end
  2236. local left, right = {}, {}
  2237. if lastL then for i = 1, #lastL do
  2238. left[i-#lastL-1] = lastL[i]
  2239. right[i-#lastR-1] = lastR[i]
  2240. end end
  2241. local predictorIndexL, predictorIndexR, deltaL, deltaR, sample1L, sample1R, sample2L, sample2R = str_unpack("<BBhhhhhh", data, n)
  2242. local c1L, c2L, c1R, c2R = coeff1[predictorIndexL], coeff2[predictorIndexL], coeff1[predictorIndexR], coeff2[predictorIndexR]
  2243. left[1] = math_floor(sample2L / (sample2L < 0 and 128 or 127))
  2244. left[2] = math_floor(sample1L / (sample1L < 0 and 128 or 127))
  2245. right[1] = math_floor(sample2R / (sample2R < 0 and 128 or 127))
  2246. right[2] = math_floor(sample1R / (sample1R < 0 and 128 or 127))
  2247. for i = 14, blockAlign - 1 do
  2248. local b = str_byte(data, n+i)
  2249. local hi, lo = bit32_rshift(b, 4), bit32_band(b, 0x0F)
  2250. if hi >= 8 then hi = hi - 16 end
  2251. if lo >= 8 then lo = lo - 16 end
  2252. local predictor = clamp(math_floor((sample1L * c1L + sample2L * c2L) / 256) + hi * deltaL, -32768, 32767)
  2253. left[#left+1] = math_floor(predictor / (predictor < 0 and 128 or 127))
  2254. sample2L, sample1L = sample1L, predictor
  2255. deltaL = math_max(math_floor(msadpcm_adaption_table[hi] * deltaL / 256), 16)
  2256. predictor = clamp(math_floor((sample1R * c1R + sample2R * c2R) / 256) + lo * deltaR, -32768, 32767)
  2257. right[#right+1] = math_floor(predictor / (predictor < 0 and 128 or 127))
  2258. sample2R, sample1R = sample1R, predictor
  2259. deltaR = math_max(math_floor(msadpcm_adaption_table[lo] * deltaR / 256), 16)
  2260. end
  2261. lastL, lastR = left, right
  2262. for i = 1, newlen do
  2263. local x = (i - 1) / ratio + 1
  2264. local l, r
  2265. if x % 1 == 0 then l, r = left[x], right[x]
  2266. else l, r = interp(left, x), interp(right, x) end
  2267. if mono then retval[1][rp+i] = clamp(math_floor((l + r) / 2), -128, 127)
  2268. else retval[1][rp+i], retval[2][rp+i] = clamp(math_floor(l), -128, 127), clamp(math_floor(r), -128, 127) end
  2269. end
  2270. rp = rp + newlen
  2271. n = n + blockAlign
  2272. end
  2273. if #retval[1] == 0 then return nil end
  2274. return retval, (n + pos) / bytesPerSecond
  2275. end, not isfunc and #data / blockAlign * samplesPerBlock / sampleRate or nil
  2276. elseif channels == 1 then
  2277. local samplesPerBlock = (blockAlign - 7) * 2
  2278. local iterPerSecond = math_ceil(sampleRate / samplesPerBlock)
  2279. local bytesPerSecond = blockAlign * iterPerSecond
  2280. local newlen = math_floor(samplesPerBlock * ratio)
  2281. return function()
  2282. if data == nil then return nil end
  2283. local target = n + bytesPerSecond
  2284. local retval = {{}}
  2285. local rp = 0
  2286. local start = os_epoch "utc"
  2287. while n < target do
  2288. if os_epoch "utc" - start > 3000 then
  2289. os_queueEvent("nosleep")
  2290. repeat until "nosleep" == os_pullEvent()
  2291. start = os_epoch "utc"
  2292. end
  2293. if isfunc and n > #data then
  2294. pos = pos + #data
  2295. n = n - #data
  2296. data = input()
  2297. if data == nil then return nil end
  2298. end
  2299. if n > #data then break end
  2300. local left = {}
  2301. local predictorIndex, delta, sample1, sample2 = str_unpack("<!1Bhhh", data, n)
  2302. local c1, c2 = coeff1[predictorIndex], coeff2[predictorIndex]
  2303. left[1] = sample2 / (sample2 < 0 and 128 or 127)
  2304. left[2] = sample1 / (sample1 < 0 and 128 or 127)
  2305. for i = 7, blockAlign - 1 do
  2306. local b = str_byte(data, n+i)
  2307. local hi, lo = bit32_rshift(b, 4), bit32_band(b, 0x0F)
  2308. if hi >= 8 then hi = hi - 16 end
  2309. if lo >= 8 then lo = lo - 16 end
  2310. local predictor = clamp(math_floor((sample1 * c1 + sample2 * c2) / 256) + hi * delta, -32768, 32767)
  2311. left[#left+1] = predictor / (predictor < 0 and 128 or 127)
  2312. sample2, sample1 = sample1, predictor
  2313. delta = math_max(math_floor(msadpcm_adaption_table[hi] * delta / 256), 16)
  2314. predictor = clamp(math_floor((sample1 * c1 + sample2 * c2) / 256) + lo * delta, -32768, 32767)
  2315. left[#left+1] = predictor / (predictor < 0 and 128 or 127)
  2316. sample2, sample1 = sample1, predictor
  2317. delta = math_max(math_floor(msadpcm_adaption_table[lo] * delta / 256), 16)
  2318. end
  2319. for i = 1, newlen do
  2320. local x = (i - 1) / ratio + 1
  2321. if x % 1 == 0 then retval[1][rp+i] = clamp(math_floor(left[x]), -128, 127)
  2322. else retval[1][rp+i] = clamp(math_floor(interp(left, x)), -128, 127) end
  2323. end
  2324. rp = rp + newlen
  2325. n = n + blockAlign
  2326. end
  2327. if #retval[1] == 0 then return nil end
  2328. return retval, (n + pos) / bytesPerSecond
  2329. end, not isfunc and #data / blockAlign * samplesPerBlock / sampleRate or nil
  2330. else error("Unsupported number of channels: " .. channels) end
  2331. end
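--[=[
Sketch (not part of AUKit): the per-nibble update used by Microsoft ADPCM, isolated
into one function. The `msadpcmStep` name is hypothetical. The table below holds the
standard adaptation values indexed by the unsigned nibble (0..15); this indexing is an
assumption of the example, since the decoder above indexes its own table by the signed
nibble.

```lua
-- Standard Microsoft ADPCM adaptation table, indexed by the unsigned nibble.
local ADAPT = {[0] = 230, 230, 230, 230, 307, 409, 512, 614,
               768, 614, 512, 409, 307, 230, 230, 230}

local function clamp(v, lo, hi) return math.max(lo, math.min(hi, v)) end

-- Hypothetical helper: decode one 4-bit code.
-- s1/s2 are the previous two samples, c1/c2 the block's coefficients.
-- Returns the new 16-bit sample and the updated delta.
local function msadpcmStep(nibble, s1, s2, c1, c2, delta)
    local signed = nibble >= 8 and nibble - 16 or nibble
    local sample = clamp(math.floor((s1 * c1 + s2 * c2) / 256) + signed * delta, -32768, 32767)
    local newDelta = math.max(math.floor(ADAPT[nibble] * delta / 256), 16)
    return sample, newDelta
end

local sample, delta = msadpcmStep(3, 0, 0, 256, 0, 16)
```

The delta never drops below 16, which keeps the decoder from stalling on quiet passages.
]=]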
  2332.  
  2333. --- Returns an iterator to stream audio from IMA ADPCM data. Audio will
  2334. --- automatically be resampled to 48 kHz, and mixed to mono if desired. Data
  2335. --- *must* be in the interleaving format used in WAV files (i.e. periodic blocks
  2336. --- with 4/8-byte headers, channels alternating every 4 bytes, lower nibble first).
  2337. ---@param input string|function():string The audio data as a raw string or
  2338. --- reader function
  2339. ---@param blockAlign number The number of bytes in each block
  2340. ---@param channels? number The number of channels present in the audio
  2341. ---@param sampleRate? number The sample rate of the audio in Hertz
  2342. ---@param mono? boolean Whether to mix the audio down to mono
  2343. ---@return fun():number[][]|nil,number|nil _ An iterator function that returns
  2344. --- chunks of each channel's data as arrays of signed 8-bit 48kHz PCM, as well as
  2345. --- the current position of the audio in seconds
  2346. ---@return number|nil _ The total length of the audio in seconds, or nil if input
  2347. --- is a function
  2348. function aukit.stream.adpcm(input, blockAlign, channels, sampleRate, mono)
  2349. expect(1, input, "string", "function")
  2350. expect(2, blockAlign, "number")
  2351. channels = expect(3, channels, "number", "nil") or 1
  2352. sampleRate = expect(4, sampleRate, "number", "nil") or 48000
  2353. expect(5, mono, "boolean", "nil")
  2354. expect.range(sampleRate, 1)
  2355. local isfunc = type(input) == "function"
  2356. local ratio = 48000 / sampleRate
  2357. local interp = interpolate[aukit.defaultInterpolation]
  2358. local n, pos = 1, 0
  2359. local data = isfunc and input() or input
  2360. local samplesPerBlock = (blockAlign - 4 * channels) * 2 / channels
  2361. local iterPerSecond = math_ceil(sampleRate / samplesPerBlock)
  2362. local bytesPerSecond = blockAlign * iterPerSecond
  2363. local newlen = math_floor(samplesPerBlock * ratio)
  2364. local last
  2365. return function()
  2366. if data == nil then return nil end
  2367. local target = n + bytesPerSecond
  2368. local retval = {{}}
  2369. if not mono then for i = 2, channels do retval[i] = {} end end
  2370. local rp = 0
  2371. if isfunc and target > #data then
  2372. pos = pos + n - 1
  2373. target = target - n + 1
  2374. data = str_sub(data, n)
  2375. n = 1
  2376. while #data < target do
  2377. local d = input()
  2378. if not d then break end
  2379. data = data .. d
  2380. end
  2381. end
  2382. local start = os_epoch "utc"
  2383. while n < target do
  2384. if os_epoch "utc" - start > 3000 then
  2385. os_queueEvent("nosleep")
  2386. repeat until "nosleep" == os_pullEvent()
  2387. start = os_epoch "utc"
  2388. end
  2389. if n + channels * 4 > #data then break end
  2390. local d = {}
  2391. for i = 1, channels do d[i] = {} end
  2392. if last then for i = 1, channels do for j = 1, #last[i] do d[i][j-#last[i]-1] = last[i][j] end end end
  2393. local predictor, step_index, step = {}, {}, {}
  2394. for i = 1, channels do predictor[i], step_index[i] = str_unpack("<hB", data, n + (i - 1) * 4) end
  2395. for i = channels * 4, blockAlign, channels * 4 do
  2396. local p = (i - channels * 4) / channels * 2 + 1
  2397. if #data < n + i + channels*4 then break end
  2398. for j = 1, channels do
  2399. local num = str_unpack("<I", data, n + i + (j-1)*4)
  2400. for k = 0, 7 do
  2401. local nibble = bit32_extract(num, k*4, 4)
  2402. step[j] = ima_step_table[step_index[j]]
  2403. step_index[j] = clamp(step_index[j] + ima_index_table[nibble], 0, 88)
  2404. local diff = bit32_rshift((nibble % 8) * step[j], 2) + bit32_rshift(step[j], 3)
  2405. if nibble >= 8 then predictor[j] = clamp(predictor[j] - diff, -32768, 32767)
  2406. else predictor[j] = clamp(predictor[j] + diff, -32768, 32767) end
  2407. d[j][p+k] = predictor[j] / (predictor[j] < 0 and 128 or 127)
  2408. end
  2409. end
  2410. end
  2411. last = d
  2412. if #d[1] < samplesPerBlock then newlen = math_floor(#d[1] * ratio) end
  2413. for i = 1, newlen do
  2414. local x = (i - 1) / ratio + 1
  2415. local c = {}
  2416. if x % 1 == 0 then for j = 1, channels do c[j] = d[j][x] end
  2417. else for j = 1, channels do c[j] = interp(d[j], x) end end
  2418. if mono then
  2419. local n = 0
  2420. for j = 1, channels do n = n + c[j] end
  2421. retval[1][rp+i] = clamp(math_floor(n / channels), -128, 127)
  2422. else for j = 1, channels do retval[j][rp+i] = clamp(math_floor(c[j]), -128, 127) end end
  2423. end
  2424. rp = rp + newlen
  2425. n = n + blockAlign
  2426. end
  2427. if #retval[1] == 0 then return nil end
  2428. return retval, (n + pos) / bytesPerSecond
  2429. end, not isfunc and #data / blockAlign * samplesPerBlock / sampleRate or nil
  2430. end
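--[=[
Sketch (not part of AUKit): the IMA ADPCM update for a single 4-bit code, using the
same difference formula as the decoder above. The `imaStep` name and the choice to
pass the current step size in as an argument (instead of looking it up in the 89-entry
step table) are assumptions made to keep the example short.

```lua
-- Index adjustments for each 4-bit code (standard IMA ADPCM index table).
local INDEX = {[0] = -1, -1, -1, -1, 2, 4, 6, 8, -1, -1, -1, -1, 2, 4, 6, 8}

local function clamp(v, lo, hi) return math.max(lo, math.min(hi, v)) end

-- Hypothetical helper: apply one nibble given the current step size.
-- Returns the new 16-bit predictor and the new step index (0..88).
local function imaStep(nibble, predictor, stepIndex, step)
    local diff = math.floor((nibble % 8) * step / 4) + math.floor(step / 8)
    if nibble >= 8 then diff = -diff end   -- the top bit is the sign
    return clamp(predictor + diff, -32768, 32767), clamp(stepIndex + INDEX[nibble], 0, 88)
end

local predictor, stepIndex = imaStep(5, 0, 0, 7)
```

Here 7 is the first entry of the step table, so the call models the first code of a block whose header step index is 0.
]=]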
  2431.  
  2432. --- Returns an iterator to stream audio from u-law/A-law G.711 data. Audio will
  2433. --- automatically be resampled to 48 kHz, and mixed to mono if desired.
  2434. ---@param input string|function():string The audio data as a raw string or
  2435. --- reader function
  2436. ---@param ulaw boolean Whether the audio uses u-law (true) or A-law (false).
  2437. ---@param channels? number The number of channels present in the audio
  2438. ---@param sampleRate? number The sample rate of the audio in Hertz
  2439. ---@param mono? boolean Whether to mix the audio down to mono
  2440. ---@return fun():number[][]|nil,number|nil _ An iterator function that returns
  2441. --- chunks of each channel's data as arrays of signed 8-bit 48kHz PCM, as well as
  2442. --- the current position of the audio in seconds
  2443. ---@return number|nil _ The total length of the audio in seconds, or nil if input
  2444. --- is a function
  2445. function aukit.stream.g711(input, ulaw, channels, sampleRate, mono)
  2446. expect(1, input, "string", "function")
  2447. expect(2, ulaw, "boolean")
  2448. channels = expect(3, channels, "number", "nil") or 1
  2449. sampleRate = expect(4, sampleRate, "number", "nil") or 8000
  2450. expect(5, mono, "boolean", "nil")
  2451. local csize = jit and 7680 or 32768
  2452. local xor = ulaw and 0xFF or 0x55
  2453. local isfunc = type(input) == "function"
  2454. local buf, pos = "", 1
  2455. local ratio = 48000 / sampleRate
  2456. local interp = interpolate[aukit.defaultInterpolation]
  2457. local last
  2458. return function()
  2459. local lp = pos
  2460. local retval = {}
  2461. for i = 1, channels do retval[i] = {} end
  2462. if last then for i = 1, channels do for j = 1, #last[i] do retval[i][j-#last[i]-1] = last[i][j] end end end
  2463. local data
  2464. if isfunc then
  2465. while #buf < sampleRate * channels do
  2466. local d = input()
  2467. if not d then
  2468. if #buf == 0 then return nil
  2469. else break end
  2470. end
  2471. buf = buf .. d
  2472. end
  2473. data = str_sub(buf, 1, sampleRate * channels)
  2474. buf = str_sub(buf, sampleRate * channels + 1)
  2475. else data = str_sub(input, pos, pos + sampleRate * channels - 1) end
  2476. pos = pos + sampleRate * channels
  2477. local start = os_epoch "utc"
  2478. for i = 1, #data, csize do
  2479. local bytes = {str_byte(data, i, i + csize - 1)}
  2480. for j = 1, #bytes do
  2481. local b = bit32_bxor(bytes[j], xor)
  2482. local m, e = bit32_band(b, 0x0F), bit32_extract(b, 4, 3)
  2483. if not ulaw and e == 0 then m = m * 4 + 2
  2484. else m = bit32_lshift(m * 2 + 33, e) end
  2485. if ulaw then m = m - 33 end
  2486. retval[(i+j-2) % channels + 1][math_floor((i+j-2) / channels + 1)] = m / (bit32_btest(b, 0x80) == ulaw and -0x40 or 0x40)
  2487. end
  2488. if os_epoch "utc" - start > 3000 then start = os_epoch "utc" sleep(0) end
  2489. end
  2490. last = {}
  2491. for j = 1, channels do last[j] = {} for i = 1, sincWindowSize do last[j][i] = retval[j][#retval[j]-sincWindowSize+i] end end
  2492. local newlen = math_floor(#retval[1] * ratio)
  2493. local resamp = {}
  2494. for j = 1, channels do resamp[j] = {} end
  2495. for i = 1, newlen do
  2496. local x = (i - 1) / ratio + 1
  2497. local c = {}
  2498. if x % 1 == 0 then for j = 1, channels do c[j] = retval[j][x] end
  2499. else for j = 1, channels do c[j] = interp(retval[j], x) end end
  2500. if mono then
  2501. local n = 0
  2502. for j = 1, channels do n = n + c[j] end
  2503. resamp[1][i] = clamp(math_floor(n / channels), -128, 127)
  2504. else for j = 1, channels do resamp[j][i] = clamp(math_floor(c[j]), -128, 127) end end
  2505. end
  2506. return resamp, (lp - 1) / sampleRate / channels
  2507. end, not isfunc and #input / sampleRate / channels or nil
  2508. end
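--[=[
Sketch (not part of AUKit): the u-law half of the G.711 expansion above, written with
plain arithmetic so it runs without a bit library. The `decodeUlaw` name is
hypothetical; the formula mirrors the decoder above (mantissa, exponent, bias of 33,
division by 64 to reach the 8-bit range).

```lua
-- Hypothetical helper: expand one u-law byte to a sample in about -125.5..125.5.
local function decodeUlaw(byte)
    local b = 255 - byte                   -- same as XOR with 0xFF
    local m = b % 16                       -- 4-bit mantissa
    local e = math.floor(b / 16) % 8       -- 3-bit exponent
    local magnitude = (m * 2 + 33) * 2 ^ e - 33
    local negative = b >= 128              -- sign bit of the inverted byte
    return magnitude / (negative and -64 or 64)
end

local silence = decodeUlaw(0xFF)
```

The bytes 0x00 and 0x80 decode to the most negative and most positive values respectively.
]=]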
  2509.  
  2510. --- Returns an iterator to stream audio from a WAV file. Audio will automatically
  2511. --- be resampled to 48 kHz, and optionally mixed down to mono. This accepts PCM
  2512. --- files up to 32 bits, including float data, as well as DFPWM files [as specified here](https://gist.github.com/MCJack123/90c24b64c8e626c7f130b57e9800962c).
  2513. ---@param data string|function():string The WAV file to decode, or a function
  2514. --- returning chunks to decode (the first chunk MUST contain the ENTIRE header)
  2515. ---@param mono? boolean Whether to mix the audio to mono
  2516. ---@param ignoreHeader? boolean Whether to ignore additional headers
  2517. --- if they appear later in the audio stream
  2518. ---@return fun():number[][]|nil,number|nil _ An iterator function that returns
  2519. --- chunks of each channel's data as arrays of signed 8-bit 48kHz PCM, as well as
  2520. --- the current position of the audio in seconds
  2521. ---@return number|nil _ The total length of the audio in seconds
  2522. function aukit.stream.wav(data, mono, ignoreHeader)
  2523. local fn
  2524. if type(data) == "function" then fn, data = data, data() end
  2525. expect(1, data, "string")
  2526. local channels, sampleRate, bitDepth, length, dataType, blockAlign, coefficients
  2527. local temp, pos = str_unpack("c4", data)
  2528. if temp ~= "RIFF" then error("bad argument #1 (not a WAV file)", 2) end
  2529. pos = pos + 4
  2530. temp, pos = str_unpack("c4", data, pos)
  2531. if temp ~= "WAVE" then error("bad argument #1 (not a WAV file)", 2) end
  2532. while pos <= #data do
  2533. local size
  2534. temp, pos = str_unpack("c4", data, pos)
  2535. size, pos = str_unpack("<I", data, pos)
  2536. if temp == "fmt " then
  2537. local chunk = str_sub(data, pos, pos + size - 1)
  2538. pos = pos + size
  2539. local format
  2540. format, channels, sampleRate, blockAlign, bitDepth = str_unpack("<HHIxxxxHH", chunk)
  2541. if format == 1 then
  2542. dataType = bitDepth == 8 and "unsigned" or "signed"
  2543. elseif format == 2 then
  2544. dataType = "msadpcm"
  2545. local numcoeff = str_unpack("<H", chunk, 21)
  2546. if numcoeff > 0 then
  2547. coefficients = {{}, {}}
  2548. for i = 1, numcoeff do
  2549. coefficients[1][i], coefficients[2][i] = str_unpack("<hh", chunk, i * 4 + 19)
  2550. end
  2551. end
  2552. elseif format == 3 then
  2553. dataType = "float"
  2554. elseif format == 6 then
  2555. dataType = "alaw"
  2556. elseif format == 7 then
  2557. dataType = "ulaw"
  2558. elseif format == 0x11 then
  2559. dataType = "adpcm"
  2560. elseif format == 0xFFFE then
  2561. bitDepth = str_unpack("<H", chunk, 19)
  2562. local uuid = str_sub(chunk, 25, 40)
  2563. if uuid == wavExtensible.pcm then dataType = bitDepth == 8 and "unsigned" or "signed"
  2564. elseif uuid == wavExtensible.dfpwm then dataType = "dfpwm"
  2565. elseif uuid == wavExtensible.msadpcm then dataType = "msadpcm"
  2566. elseif uuid == wavExtensible.pcm_float then dataType = "float"
  2567. elseif uuid == wavExtensible.alaw then dataType = "alaw"
  2568. elseif uuid == wavExtensible.ulaw then dataType = "ulaw"
  2569. elseif uuid == wavExtensible.adpcm then dataType = "adpcm"
  2570. else error("unsupported WAV file", 2) end
  2571. else error("unsupported WAV file", 2) end
  2572. elseif temp == "data" then
  2573. local data = str_sub(data, pos, pos + size - 1)
  2574. if not fn and #data < size then error("invalid WAV file", 2) end
  2575. if fn then
  2576. local first, f = data
  2577. data = function()
  2578. if first then f, first = first return f
  2579. elseif ignoreHeader then
  2580. local d = fn()
  2581. if not d then return nil end
  2582. if d:match "^RIFF....WAVE" then return str_sub(d, d:match("^RIFF....WAVE.-data....()"))
  2583. else return d end
  2584. else return fn() end
  2585. end
  2586. end
  2587. if dataType == "adpcm" then return aukit.stream.adpcm(data, blockAlign, channels, sampleRate, mono)
  2588. elseif dataType == "msadpcm" then return aukit.stream.msadpcm(data, blockAlign, channels, sampleRate, mono, coefficients)
  2589. elseif dataType == "dfpwm" then return aukit.stream.dfpwm(data, sampleRate, channels, mono), size / channels / (bitDepth / 8) / sampleRate
  2590. elseif dataType == "alaw" or dataType == "ulaw" then return aukit.stream.g711(data, dataType == "ulaw", channels, sampleRate, mono)
  2591. else return aukit.stream.pcm(data, bitDepth, dataType, channels, sampleRate, false, mono), size / channels / (bitDepth / 8) / sampleRate end
  2592. elseif temp == "fact" then
  2593. -- TODO
  2594. pos = pos + size
  2595. else pos = pos + size end
  2596. end
  2597. error("invalid WAV file", 2)
  2598. end
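--[=[
Sketch (not part of AUKit): walking the chunk list of a RIFF/WAVE string the same way
the header loop above does (4-byte ID, 4-byte little-endian size, then the body). The
`listChunks` name is hypothetical, and the example assumes `string.pack` and
`string.unpack` are available (Lua 5.3+ or CC: Tweaked).

```lua
-- Hypothetical helper: list the chunks of a RIFF/WAVE string.
local function listChunks(data)
    assert(data:sub(1, 4) == "RIFF" and data:sub(9, 12) == "WAVE", "not a WAV file")
    local chunks, pos = {}, 13
    while pos + 7 <= #data do
        -- 4-byte chunk ID, 4-byte little-endian size, then the chunk body
        local id, size, body = string.unpack("<c4I4", data, pos)
        chunks[#chunks + 1] = {id = id, size = size, offset = body}
        pos = body + size
    end
    return chunks
end

-- Build a minimal header: 16-bit stereo PCM at 44.1 kHz with an empty data chunk.
local fmt = string.pack("<HHI4I4HH", 1, 2, 44100, 176400, 4, 16)
local wav = "RIFF" .. string.pack("<I4", 36) .. "WAVE"
    .. "fmt " .. string.pack("<I4", #fmt) .. fmt
    .. "data" .. string.pack("<I4", 0)
local chunks = listChunks(wav)
```

Unknown chunk IDs are simply skipped by advancing `pos` past their body, which is also what the loop above does.
]=]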
  2599.  
  2600. --- Returns an iterator to stream audio from an AIFF or AIFC file. Audio will
  2601. --- automatically be resampled to 48 kHz, and optionally mixed down to mono.
  2602. ---@param data string|function():string The AIFF file to decode, or a function
  2603. --- returning chunks to decode (the first chunk MUST contain the ENTIRE header)
  2604. ---@param mono? boolean Whether to mix the audio to mono
  2605. ---@param ignoreHeader? boolean Whether to ignore additional headers
  2606. --- if they appear later in the audio stream
  2607. ---@return fun():number[][]|nil,number|nil _ An iterator function that returns
  2608. --- chunks of each channel's data as arrays of signed 8-bit 48kHz PCM, as well as
  2609. --- the current position of the audio in seconds
  2610. ---@return number _ The total length of the audio in seconds
  2611. function aukit.stream.aiff(data, mono, ignoreHeader)
  2612. local fn
  2613. if type(data) == "function" then fn, data = data, data() end
  2614. expect(1, data, "string")
  2615. expect(2, mono, "boolean", "nil")
  2616. local channels, sampleRate, bitDepth, length, offset, compression, blockAlign
  2617. local isAIFC = false
  2618. local temp, pos = str_unpack("c4", data)
  2619. if temp ~= "FORM" then error("bad argument #1 (not an AIFF file)", 2) end
  2620. pos = pos + 4
  2621. temp, pos = str_unpack("c4", data, pos)
  2622. if temp == "AIFC" then isAIFC = true
  2623. elseif temp ~= "AIFF" then error("bad argument #1 (not an AIFF file)", 2) end
  2624. while pos <= #data do
  2625. local size
  2626. temp, pos = str_unpack("c4", data, pos)
  2627. size, pos = str_unpack(">I", data, pos)
  2628. if temp == "COMM" then
  2629. local e, m
  2630. channels, length, bitDepth, e, m, pos = str_unpack(">hIhHI7x", data, pos)
  2631. if isAIFC then
  2632. local s
  2633. compression, s, pos = str_unpack(">c4s1", data, pos)
  2634. if #s % 2 == 0 then pos = pos + 1 end
  2635. end
  2636. length = length * channels * math_floor(bitDepth / 8)
  2637. local s = bit32_btest(e, 0x8000)
  2638. e = ((bit32_band(e, 0x7FFF) - 0x3FFE) % 0x800)
  2639. sampleRate = math.ldexp(m * (s and -1 or 1) / 0x100000000000000, e)
  2640. elseif temp == "SSND" then
  2641. offset, blockAlign, pos = str_unpack(">II", data, pos)
  2642. local data = str_sub(data, pos + offset, pos + offset + length - 1)
  2643. if not fn and #data < length then error("invalid AIFF file", 2) end
  2644. if fn then
  2645. local first, f = data
  2646. data = function()
  2647. if first then f, first = first return f
  2648. elseif ignoreHeader then
  2649. local d = fn()
  2650. if not d then return nil end
  2651. if d:match "^FORM....AIF[FC]" then
  2652. local n, p = d:match("^FORM....AIF[FC].-SSND(....)....()")
  2653. offset = str_unpack(">I", n)
  2654. return str_sub(d, p + offset)
  2655. else return d end
  2656. else return fn() end
  2657. end
  2658. end
  2659. if compression == nil or compression == "NONE" then return aukit.stream.pcm(data, bitDepth, "signed", channels, sampleRate, true, mono), length / channels / (bitDepth / 8) / sampleRate
  2660. elseif compression == "sowt" then return aukit.stream.pcm(data, bitDepth, "signed", channels, sampleRate, false, mono), length / channels / (bitDepth / 8) / sampleRate
  2661. elseif compression == "fl32" or compression == "FL32" then return aukit.stream.pcm(data, 32, "float", channels, sampleRate, true, mono), length / channels / 4 / sampleRate
  2662. elseif compression == "alaw" or compression == "ulaw" or compression == "ALAW" or compression == "ULAW" then return aukit.stream.g711(data, compression == "ulaw" or compression == "ULAW", channels, sampleRate, mono), length / channels / sampleRate
  2663. else error("Unsupported compression scheme " .. compression, 2) end
  2664. return aukit.stream.pcm(data, bitDepth, "signed", channels, sampleRate, true, mono), length / channels / (bitDepth / 8) / sampleRate
  2665. else pos = pos + size end
  2666. end
  2667. error("invalid AIFF file", 2)
  2668. end
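--[=[
Sketch (not part of AUKit): the AIFF COMM chunk stores the sample rate as an 80-bit
extended float. The example reads the 16 sign/exponent bits and the top 56 mantissa
bits, as the decoder above does, and rebuilds the number. The `extendedToNumber` name
is hypothetical, and the example assumes `string.unpack` is available (Lua 5.3+ or
CC: Tweaked).

```lua
-- Hypothetical helper: convert a 10-byte big-endian 80-bit extended float
-- to a Lua number, ignoring the lowest 8 mantissa bits.
local function extendedToNumber(bytes)
    -- 16 bits of sign and exponent, then the top 56 bits of the mantissa
    local se, mant = string.unpack(">I2I7", bytes)
    local sign = se >= 0x8000 and -1 or 1
    local exponent = se % 0x8000 - 0x3FFE
    return sign * (mant / 2 ^ 56) * 2 ^ exponent
end

-- 0x400E AC44 0000 0000 0000 is the usual encoding of 44100 Hz.
local rate = extendedToNumber(string.char(0x40, 0x0E, 0xAC, 0x44, 0, 0, 0, 0, 0, 0))
```

Dropping the last mantissa byte loses no precision for ordinary sample rates, since they are small integers.
]=]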
  2669.  
  2670. --- Returns an iterator to stream data from an AU file. Audio will automatically
  2671. --- be resampled to 48 kHz, and optionally mixed down to mono.
  2672. ---@param data string|function():string The AU file to decode, or a function
  2673. --- returning chunks to decode (the first chunk MUST contain the ENTIRE header)
  2674. ---@param mono? boolean Whether to mix the audio to mono
  2675. ---@param ignoreHeader? boolean Whether to ignore additional headers
  2676. --- if they appear later in the audio stream
  2677. ---@return fun():number[][]|nil,number|nil _ An iterator function that returns
  2678. --- chunks of each channel's data as arrays of signed 8-bit 48kHz PCM, as well as
  2679. --- the current position of the audio in seconds
  2680. ---@return number _ The total length of the audio in seconds
  2681. function aukit.stream.au(data, mono, ignoreHeader)
  2682. local fn
  2683. if type(data) == "function" then fn, data = data, data() end
  2684. expect(1, data, "string")
  2685. expect(2, mono, "boolean", "nil")
  2686. local magic, offset, size, encoding, sampleRate, channels = str_unpack(">c4IIIII", data)
  2687. if magic ~= ".snd" then error("invalid AU file", 2) end
  2688. if fn then
  2689. local first, f = str_sub(data, offset, size ~= 0xFFFFFFFF and offset + size - 1 or nil), nil
  2690. data = function()
  2691. if first then f, first = first return f
  2692. elseif ignoreHeader then
  2693. local d = fn()
  2694. if not d then return nil end
  2695. if d:match "^.snd" then return str_sub(d, str_unpack(">I", str_sub(d, 5, 8)), nil)
  2696. else return d end
  2697. else return fn() end
  2698. end
  2699. else data = str_sub(data, offset, size ~= 0xFFFFFFFF and offset + size - 1 or nil) end
  2700. if encoding == 1 then return aukit.stream.g711(data, true, channels, sampleRate, mono), size / channels / sampleRate
  2701. elseif encoding == 2 then return aukit.stream.pcm(data, 8, "signed", channels, sampleRate, true, mono), size / channels / sampleRate
  2702. elseif encoding == 3 then return aukit.stream.pcm(data, 16, "signed", channels, sampleRate, true, mono), size / channels / 2 / sampleRate
  2703. elseif encoding == 4 then return aukit.stream.pcm(data, 24, "signed", channels, sampleRate, true, mono), size / channels / 3 / sampleRate
  2704. elseif encoding == 5 then return aukit.stream.pcm(data, 32, "signed", channels, sampleRate, true, mono), size / channels / 4 / sampleRate
  2705. elseif encoding == 6 then return aukit.stream.pcm(data, 32, "float", channels, sampleRate, true, mono), size / channels / 4 / sampleRate
  2706. elseif encoding == 27 then return aukit.stream.g711(data, false, channels, sampleRate, mono), size / channels / sampleRate
  2707. else error("unsupported encoding type " .. encoding, 2) end
  2708. end
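--[=[
Sketch (not part of AUKit): reading the fixed 24-byte AU header and deriving the
duration the same way the return values above do (size / channels / bytes per sample /
sample rate). The `parseAuHeader` name and the returned table layout are assumptions
for illustration; the example also assumes `string.pack` and `string.unpack` are
available (Lua 5.3+ or CC: Tweaked).

```lua
-- Bytes per sample for the AU encodings handled above.
local BYTES = {[1] = 1, [2] = 1, [3] = 2, [4] = 3, [5] = 4, [6] = 4, [27] = 1}

-- Hypothetical helper: parse the fixed 24-byte AU header.
local function parseAuHeader(data)
    local magic, offset, size, encoding, sampleRate, channels =
        string.unpack(">c4I4I4I4I4I4", data)
    assert(magic == ".snd", "invalid AU file")
    local seconds
    -- A size of 0xFFFFFFFF means "unknown", so no duration can be derived.
    if size ~= 0xFFFFFFFF and BYTES[encoding] then
        seconds = size / channels / BYTES[encoding] / sampleRate
    end
    return {offset = offset, size = size, encoding = encoding,
            sampleRate = sampleRate, channels = channels, seconds = seconds}
end

-- One second of 16-bit mono PCM at 8 kHz.
local header = parseAuHeader(".snd" .. string.pack(">I4I4I4I4I4", 24, 16000, 3, 8000, 1))
```

When the size field is unknown, `seconds` stays nil and a caller would have to fall back to the length of the data actually received.
]=]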
  2709.  
--- Returns an iterator to stream data from a FLAC file. Audio will automatically
--- be resampled to 48 kHz, and optionally mixed down to mono.
---@param data string|function():string The FLAC file to decode, or a function
--- returning chunks to decode
---@param mono? boolean Whether to mix the audio to mono
---@return fun():number[][]|nil,number|nil _ An iterator function that returns
--- chunks of each channel's data as arrays of signed 8-bit 48kHz PCM, as well as
--- the current position of the audio in seconds
---@return number _ The total length of the audio in seconds
function aukit.stream.flac(data, mono)
    expect(1, data, "string", "function")
    expect(2, mono, "boolean", "nil")
    local infn = false
    if type(data) == "function" then data = setmetatable({str = "", fn = data, final = false, byte = function(self, start, e)
        while start > #self.str do
            infn = true
            local d = self.fn()
            infn = false
            if not d then self.final = true return nil end
            self.str = self.str .. d
        end
        while e and e > #self.str do
            infn = true
            local d = self.fn()
            infn = false
            if not d then self.final = true return nil end
            self.str = self.str .. d
        end
        return str_byte(self.str, start, e)
    end}, {__len = function(self) return self.final and #self.str or math.huge end}) end
    local function saferesume(coro, ...)
        local res = table_pack(coroutine.resume(coro, ...))
        while res[1] and infn do res = table_pack(coroutine.resume(coro, coroutine.yield(table_unpack(res, 2, res.n)))) end
        return table_unpack(res, 1, res.n)
    end
    local coro = coroutine.create(decodeFLAC)
    local ok, sampleRate, len = saferesume(coro, data, coroutine.yield)
    if not ok then error(sampleRate, 2) end
    local pos = 0
    local ratio = 48000 / sampleRate
    local lp_alpha = 1 - math.exp(-(sampleRate / 96000) * 2 * math_pi)
    local interp = interpolate[aukit.defaultInterpolation]
    local last = {0, 0}
    return function()
        if coroutine.status(coro) == "dead" then return nil end
        local chunk = {{}}
        while #chunk[1] < sampleRate do
            local ok, res = saferesume(coro)
            if not ok or res == nil or res.sampleRate then break end
            os_queueEvent("nosleep")
            repeat until "nosleep" == os_pullEvent()
            for c = 1, #res do
                chunk[c] = chunk[c] or {}
                local src, dest = res[c], chunk[c]
                local start = #dest
                src[0] = last[2]
                src[-1] = last[1]
                local ls = last[2] / (last[2] < 0 and 128 or 127)
                for i = 1, math_floor(#src * ratio) do
                    local d = start+i
                    local x = ((i - 1) / ratio) + 1
                    local s
                    if x % 1 == 0 then s = src[x]
                    else s = interp(src, x) end
                    s = ls + lp_alpha * (s - ls)
                    ls = s
                    dest[d] = clamp(s * (s < 0 and 128 or 127), -128, 127)
                end
                last = {src[#src-1], src[#src]}
            end
            os_queueEvent("nosleep")
            repeat until "nosleep" == os_pullEvent()
        end
        pos = pos + #chunk[1] / 48000
        return chunk, pos
    end, len / sampleRate
end

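--[[
A minimal usage sketch for aukit.stream.flac, not part of AUKit itself. It
assumes a CC: Tweaked computer with an attached speaker, that this file can be
loaded with require("aukit"), and a hypothetical file named "song.flac". Only
the first channel is played, since a single speaker is mono.

    local aukit = require "aukit"
    local speaker = peripheral.find("speaker")
    local file = assert(fs.open("song.flac", "rb"))
    local data = file.readAll()
    file.close()
    local iter, length = aukit.stream.flac(data, true)
    for chunk, pos in iter do
        -- playAudio returns false while the speaker's buffer is still full,
        -- so wait for it to drain before retrying the same chunk
        while #chunk[1] > 0 and not speaker.playAudio(chunk[1]) do
            os.pullEvent("speaker_audio_empty")
        end
    end
]]
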
--[[
.########.########.########.########..######..########..######.
.##.......##.......##.......##.......##....##....##....##....##
.##.......##.......##.......##.......##..........##....##......
.######...######...######...######...##..........##.....######.
.##.......##.......##.......##.......##..........##..........##
.##.......##.......##.......##.......##....##....##....##....##
.########.##.......##.......########..######.....##.....######.
]]

--- aukit.effects
---@section aukit.effects

--- Amplifies the audio by the multiplier specified.
---@param audio Audio The audio to modify
---@param multiplier number The multiplier to apply
---@return Audio _ The audio modified
function aukit.effects.amplify(audio, multiplier)
    expectAudio(1, audio)
    expect(2, multiplier, "number")
    if multiplier == 1 then return audio end
    local start = os_epoch "utc"
    for c = 1, #audio.data do
        local ch = audio.data[c]
        for i = 1, #ch do
            if os_epoch "utc" - start > 3000 then start = os_epoch "utc" sleep(0) end
            ch[i] = clamp(ch[i] * multiplier, -1, 1)
        end
    end
    return audio
end

--- Changes the speed and pitch of audio by a multiplier, resampling to keep the
--- same sample rate.
---@param audio Audio The audio to modify
---@param multiplier number The multiplier to apply
---@return Audio _ The audio modified
function aukit.effects.speed(audio, multiplier)
    expectAudio(1, audio)
    expect(2, multiplier, "number")
    if multiplier == 1 then return audio end
    local rate = audio.sampleRate
    audio.sampleRate = audio.sampleRate * multiplier
    local new = audio:resample(rate)
    audio.sampleRate, audio.data = rate, new.data
    return audio
end

--- Fades a section of audio from one amplitude to another.
---@param audio Audio The audio to modify
---@param startTime number The start time of the fade, in seconds
---@param startAmplitude number The amplitude of the beginning of the fade
---@param endTime number The end time of the fade, in seconds
---@param endAmplitude number The amplitude of the end of the fade
---@return Audio _ The audio modified
function aukit.effects.fade(audio, startTime, startAmplitude, endTime, endAmplitude)
    expectAudio(1, audio)
    expect(2, startTime, "number")
    expect(3, startAmplitude, "number")
    expect(4, endTime, "number")
    expect(5, endAmplitude, "number")
    if startAmplitude == 1 and endAmplitude == 1 then return audio end
    local startt = os_epoch "utc"
    for c = 1, #audio.data do
        local ch = audio.data[c]
        local start = startTime * audio.sampleRate
        local m = (endAmplitude - startAmplitude) / ((endTime - startTime) * audio.sampleRate)
        -- Clamp the loop to valid integer sample indices, so that a fade starting
        -- at 0 seconds or running past the end does not index outside the channel
        local first = math_max(math_floor(start), 1)
        local last = math_min(math_floor(endTime * audio.sampleRate), #ch)
        for i = first, last do
            if os_epoch "utc" - startt > 5000 then startt = os_epoch "utc" sleep(0) end
            ch[i] = clamp(ch[i] * (m * (i - start) + startAmplitude), -1, 1)
        end
    end
    return audio
end

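--[[
A worked sketch of the linear gain ramp that the fade above applies, using a
plain table instead of an Audio object. The names rate, samples and gainAt are
hypothetical and exist only for this illustration.

    local rate = 4                        -- toy sample rate, in samples/second
    local samples = {1, 1, 1, 1, 1, 1, 1, 1}
    local startTime, endTime = 0, 2       -- fade across the first two seconds
    local startAmp, endAmp = 0, 1         -- fade in from silence
    -- slope of the gain per sample, as in the function above
    local m = (endAmp - startAmp) / ((endTime - startTime) * rate)
    local function gainAt(offset) return m * offset + startAmp end
    for i = 1, #samples do samples[i] = samples[i] * gainAt(i - 1) end
    -- samples now rises in steps of 1/8: 0, 0.125, 0.25, ... up to 0.875
]]
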
--- Inverts all channels in the specified audio.
---@param audio Audio The audio to modify
---@return Audio _ The audio modified
function aukit.effects.invert(audio)
    expectAudio(1, audio)
    for c = 1, #audio.data do
        local ch = audio.data[c]
        for i = 1, #ch do ch[i] = -ch[i] end
    end
    return audio
end

--- Normalizes audio to the specified peak amplitude.
---@param audio Audio The audio to modify
---@param peakAmplitude? number The maximum amplitude
---@param independent? boolean Whether to normalize each channel independently
---@return Audio _ The audio modified
function aukit.effects.normalize(audio, peakAmplitude, independent)
    expectAudio(1, audio)
    peakAmplitude = expect(2, peakAmplitude, "number", "nil") or 1
    expect(3, independent, "boolean", "nil")
    local mult
    local start = os_epoch "utc"
    if not independent then
        local max = 0
        for c = 1, #audio.data do
            local ch = audio.data[c]
            if os_epoch "utc" - start > 3000 then start = os_epoch "utc" sleep(0) end
            for i = 1, #ch do max = math_max(max, math_abs(ch[i])) end
        end
        -- Entirely silent audio has no peak to scale; avoid dividing by zero
        if max == 0 then return audio end
        mult = peakAmplitude / max
    end
    for c = 1, #audio.data do
        local ch = audio.data[c]
        if independent then
            local max = 0
            for i = 1, #ch do max = math_max(max, math_abs(ch[i])) end
            -- A silent channel is left as-is instead of dividing by zero
            mult = max > 0 and peakAmplitude / max or 1
        end
        if os_epoch "utc" - start > 3000 then start = os_epoch "utc" sleep(0) end
        for i = 1, #ch do
            ch[i] = clamp(ch[i] * mult, -1, 1)
        end
    end
    return audio
end

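--[[
A small worked example of the peak-normalization arithmetic above, on a plain
table rather than an Audio object. The variable names are hypothetical.

    local ch = {0.1, -0.25, 0.2}
    local peak = 0
    for i = 1, #ch do peak = math.max(peak, math.abs(ch[i])) end
    local mult = 1 / peak                 -- target peak amplitude of 1
    for i = 1, #ch do ch[i] = ch[i] * mult end
    -- peak was 0.25, so every sample is scaled by 4: {0.4, -1, 0.8}
]]
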
--- Centers the DC offset of each channel.
---@param audio Audio The audio to modify
---@return Audio _ The audio modified
function aukit.effects.center(audio)
    expectAudio(1, audio)
    for c = 1, #audio.data do
        local ch = audio.data[c]
        for i = 0, #ch - 1, audio.sampleRate do
            local avg = 0
            local l = math_min(#ch - i, audio.sampleRate)
            for j = 1, l do avg = avg + ch[i+j] end
            avg = avg / l
            for j = 1, l do ch[i+j] = clamp(ch[i+j] - avg, -1, 1) end
        end
    end
    return audio
end

--- Trims any extra silence on either end of the specified audio.
---@param audio Audio The audio to modify
---@param threshold? number The maximum value to register as silence
---@return Audio _ The audio modified
function aukit.effects.trim(audio, threshold)
    expectAudio(1, audio)
    threshold = expect(2, threshold, "number", "nil") or (1/65536)
    local s, e
    for i = 1, #audio.data[1] do
        for c = 1, #audio.data do if math_abs(audio.data[c][i]) > threshold then s = i break end end
        if s then break end
    end
    for i = #audio.data[1], 1, -1 do
        for c = 1, #audio.data do if math_abs(audio.data[c][i]) > threshold then e = i break end end
        if e then break end
    end
    -- Nothing rose above the threshold, so there are no bounds to trim to
    if not s then return audio end
    -- Use the Audio object's own sub method; string.sub does not accept a table
    local new = audio:sub(s / audio.sampleRate, e / audio.sampleRate)
    audio.data = new.data
    return audio
end

--- Adds a delay to the specified audio.
---@param audio Audio The audio to modify
---@param delay number The amount of time to delay for, in seconds
---@param multiplier? number The multiplier to apply to the delayed audio
---@return Audio _ The audio modified
function aukit.effects.delay(audio, delay, multiplier)
    expectAudio(1, audio)
    expect(2, delay, "number")
    multiplier = expect(3, multiplier, "number", "nil") or 0.5
    local samples = math_floor(delay * audio.sampleRate)
    for c = 1, #audio.data do
        local original = {}
        local o = audio.data[c]
        for i = 1, #o do original[i] = o[i] end
        for i = samples + 1, #o do o[i] = clamp(o[i] + original[i - samples] * multiplier, -1, 1) end
    end
    return audio
end

--- Adds an echo to the specified audio.
---@param audio Audio The audio to modify
---@param delay? number The amount of time to echo after, in seconds
---@param multiplier? number The decay multiplier to apply to the echoed audio
---@return Audio _ The audio modified
function aukit.effects.echo(audio, delay, multiplier)
    expectAudio(1, audio)
    delay = expect(2, delay, "number", "nil") or 1
    multiplier = expect(3, multiplier, "number", "nil") or 0.5
    local samples = math_floor(delay * audio.sampleRate)
    for c = 1, #audio.data do
        local o = audio.data[c]
        for i = samples + 1, #o do o[i] = clamp(o[i] + o[i - samples] * multiplier, -1, 1) end
    end
    return audio
end

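--[[
A sketch contrasting the delay and echo effects above, using plain tables and
an impulse as input. The delay reads from a copy of the original samples, so
it produces one repeat; the echo reads samples it has already written, so each
repeat feeds the next. The helper names delayed and echoed are hypothetical.

    local function delayed(x, n, g)       -- feed-forward: reads the input
        local y = {}
        for i = 1, #x do y[i] = x[i] + (x[i - n] or 0) * g end
        return y
    end
    local function echoed(x, n, g)        -- feedback: reads earlier output
        local y = {}
        for i = 1, #x do y[i] = x[i] + (y[i - n] or 0) * g end
        return y
    end
    local impulse = {1, 0, 0, 0, 0, 0}
    local d = delayed(impulse, 2, 0.5)    -- {1, 0, 0.5, 0, 0, 0}
    local e = echoed(impulse, 2, 0.5)     -- {1, 0, 0.5, 0, 0.25, 0}
]]
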
local combDelayShift = {0, -11.73, 19.31, -7.97}
local combDecayShift = {0, 0.1313, 0.2743, 0.31}

--- Adds reverb to the specified audio.
---@param audio Audio The audio to modify
---@param delay? number The amount of time to reverb after, in **milliseconds**
---@param decay? number The decay factor to use
---@param wetMultiplier? number The wet (reverbed) mix amount
---@param dryMultiplier? number The dry (original) mix amount
---@return Audio _ The audio modified
function aukit.effects.reverb(audio, delay, decay, wetMultiplier, dryMultiplier)
    expectAudio(1, audio)
    delay = expect(2, delay, "number", "nil") or 100
    decay = expect(3, decay, "number", "nil") or 0.3
    wetMultiplier = expect(4, wetMultiplier, "number", "nil") or 1
    dryMultiplier = expect(5, dryMultiplier, "number", "nil") or 0
    for c = 1, #audio.data do
        -- four comb filters
        local sum = {}
        local o = audio.data[c]
        for n = 1, 4 do
            local comb = {}
            local samples = math_floor((delay + combDelayShift[n]) / 1000 * audio.sampleRate)
            local multiplier = decay - combDecayShift[n]
            for i = 1, math_min(samples, #o) do
                comb[i] = o[i]
                sum[i] = (sum[i] or 0) + o[i]
            end
            for i = samples + 1, #o do
                local s = o[i] + comb[i - samples] * multiplier
                comb[i] = s
                sum[i] = (sum[i] or 0) + s
            end
        end
        -- mix wet/dry
        for i = 1, #sum do sum[i] = sum[i] * wetMultiplier + o[i] * dryMultiplier end
        -- two all pass filters
        local samples = math_floor(0.08927 * audio.sampleRate)
        sum[samples+1] = sum[samples+1] - 0.131 * sum[1]
        for i = samples + 2, #sum do sum[i] = sum[i] - 0.131 * sum[i - samples] + 0.131 * sum[i + 20 - samples] end
        o[samples+1] = clamp(sum[samples+1] - 0.131 * sum[1], -1, 1)
        for i = samples + 2, #sum do o[i] = clamp(sum[i] - 0.131 * sum[i - samples] + 0.131 * sum[i + 20 - samples], -1, 1) end
    end
    return audio
end

--- Applies a low-pass filter to the specified audio.
---@param audio Audio The audio to modify
---@param frequency number The cutoff frequency for the filter
---@return Audio _ The audio modified
function aukit.effects.lowpass(audio, frequency)
    expectAudio(1, audio)
    expect(2, frequency, "number")
    local a = 1 - math.exp(-(frequency / audio.sampleRate) * 2 * math_pi)
    for c = 1, #audio.data do
        local d = audio.data[c]
        for i = 2, #d do
            local l = d[i-1]
            d[i] = l + a * (d[i] - l)
        end
    end
    return audio
end

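--[[
A sketch of the one-pole low-pass recurrence used above, y[i] = y[i-1] +
a * (x[i] - y[i-1]), applied in place to a unit step held in a plain table.
The values of rate and cutoff are arbitrary choices for illustration.

    local rate, cutoff = 48000, 1000
    -- smoothing factor; roughly 0.12 for these values
    local a = 1 - math.exp(-(cutoff / rate) * 2 * math.pi)
    local x = {0, 1, 1, 1}                -- a step from 0 to 1
    for i = 2, #x do
        local prev = x[i - 1]             -- already-filtered previous sample
        x[i] = prev + a * (x[i] - prev)
    end
    -- x now climbs gradually toward 1: 0, a, 1 - (1 - a)^2, 1 - (1 - a)^3
]]
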
--- Applies a high-pass filter to the specified audio.
---@param audio Audio The audio to modify
---@param frequency number The cutoff frequency for the filter
---@return Audio _ The audio modified
function aukit.effects.highpass(audio, frequency)
    expectAudio(1, audio)
    expect(2, frequency, "number")
    local a = 1 / (2 * math_pi * (frequency / audio.sampleRate) + 1)
    for c = 1, #audio.data do
        local d = audio.data[c]
        local lx = d[1]
        for i = 2, #d do
            local llx = d[i]
            d[i] = a * (d[i-1] + llx - lx)
            lx = llx
        end
    end
    return audio
end

return aukit
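--[[
A sketch of the first-order high-pass recurrence used above, y[i] = a *
(y[i-1] + x[i] - x[i-1]), applied in place to a constant (DC) input in a plain
table. The values of rate and cutoff are arbitrary choices for illustration.

    local rate, cutoff = 48000, 1000
    local a = 1 / (2 * math.pi * (cutoff / rate) + 1)
    local x = {1, 1, 1, 1}                -- constant input, i.e. pure DC
    local prev = x[1]                     -- previous unfiltered sample
    for i = 2, #x do
        local cur = x[i]
        x[i] = a * (x[i - 1] + cur - prev)
        prev = cur
    end
    -- x is now 1, a, a^2, a^3: the constant offset decays geometrically
]]
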