Noise

Noise for the QiN step

After standardizing the sampling frequency, channels and format of the AudioSet files. Single-channel noise is obtained from 32 10-second samples of the following 90 classes:

[‘Drill’, ‘Truck’, ‘Cheering’, ‘Tools’, ‘Civil defense siren’, ‘Police car (siren)’, ‘Helicopter’, ‘Vibration’, ‘Drum kit’, ‘Telephone bell ringing’, ‘Drum roll’, ‘Waves, surf’, ‘Emergency vehicle’, ‘Siren’, ‘Aircraft engine’, ‘Idling’, ‘Fixed-wing aircraft, airplane’, ‘Vehicle horn, car horn, honking’, ‘Jet engine’, ‘Light engine (high frequency)’, ‘Heavy engine (low frequency)’, ‘Engine knocking’, ‘Engine starting’, ‘Motorboat, speedboat’, ‘Motor vehicle (road)’, ‘Motorcycle’, ‘Boat, Water vehicle’, ‘Fireworks’, ‘Stream’, ‘Train horn’, ‘Foghorn’, ‘Chainsaw’, ‘Wind noise (microphone)’, ‘Wind’, ‘Traffic noise, roadway noise’, ‘Environmental noise’, ‘Race car, auto racing’, ‘Railroad car, train wagon’, ‘Scratching (performance technique)’, ‘Vacuum cleaner’, ‘Tubular bells’, ‘Church bell’, ‘Jingle bell’, ‘Car alarm’, ‘Car passing by’, ‘Alarm’, ‘Alarm clock’, ‘Smoke detector, smoke alarm’, ‘Fire alarm’, ‘Thunderstorm’, ‘Hammer’, ‘Jackhammer’, ‘Steam whistle’, ‘Distortion’, ‘Air brake’, ‘Sewing machine’, ‘Applause’, ‘Drum machine’, “Dental drill, dentist’s drill”, ‘Gunshot, gunfire’, ‘Machine gun’, ‘Cap gun’, ‘Bee, wasp, etc.’, ‘Beep, bleep’, ‘Frying (food)’, ‘Sampler’, ‘Meow’, ‘Toilet flush’, ‘Whistling’, ‘Glass’, ‘Coo’, ‘Mechanisms’, ‘Rub’, ‘Boom’, ‘Frog’, ‘Coin (dropping)’, ‘Crowd’, ‘Crackle’, ‘Theremin’, ‘Whoosh, swoosh, swish’, ‘Raindrop’, ‘Engine’, ‘Rail transport’, ‘Vehicle’, ‘Drum’, ‘Car’, ‘Animal’, ‘Inside, small room’, ‘Laughter’, ‘Train’]

This represents 8 hours of audio.

Loudness normalization

from ffmpeg_normalize import FFmpegNormalize

normalizer = FFmpegNormalize(normalization_type="ebu",
                            target_level = -15.0,
                            loudness_range_target=5,
                            true_peak = -2,
                            dynamic = True,
                            print_stats=False,
                            sample_rate = 48_000,
                            progress=True)

normalizer.add_media_file(input_file='tot.rf64',
                          output_file='tot_normalized.wav')
normalizer.run_normalization()

Spatialization

The direction of sound is sampled uniformly on the unit sphere using the inverse cumulative distribution function.

Noise for the SiN step

The noise used for the final stage of the recording was captured with a ZYLIA ZR-1 Portable. It consists of applause, demonstrations and opera for a total of 2h40.