All posts by nouyang

Pandemic Diary #12, 03 Jun 2020

quite quite stressed.

feelings: ignorant for wanting to understand police side more by default, guilty for not reading up on BLM before, stressed about COVID and protests, for talking about my own experiences, weird to be the one bringing it up in group slack?, confused that i am not instant deep friends with labmates (or something lol), useless for not completing any coding for my class this semester, stressed to update my portfolio to prove I can code, bought a ebook online for the first time (anti racism), upset that my parents are going to stores more regularly, stressed about them buying a used laundry machine that leaks, that my dad never even puts his dish in the sink after he’s done eating, that i’m not learning chinese efficiently enough watching tv dramas to be worth the time, that i’ve wasted too much time in my life, that i’m not hardworking enough and would fail out as an immigrant (well, that’s definitely the case), reflecting that i’ve come a long way since i was out-of-control depressed, wanting to do startups / hack on code, feeling like i know how to write bash scripts but not real large code projects, simultaneously feel frustrated to be working with people who don’t know how to version control, use branches, and do peer review, stressed that my cat is possibly angry at me for reducing calories again, not sure if i’ll ever get her happy to use the harness, bought cat and sewing machine things on amazon that weren’t at all essential and feel bad about that, stressed about trying to keep up with ICRA, miserable about having only one paper after three years, miserable to not be doing any COVID activism and keeping up with the news, to worry about my friends and wonder proactively try to feel safe to talk to or i should ignore them, guilty that i am not doing more of the cooking and chores around the house

i think the cat is definitely angry at me, the mews sound different

maybe that is all?

saw lightning bugs today!! up to ep 18 on the oh my general drama. uncomfortable: gender swap, but is it feminist?

icra conference – sourcing digit.ml – label images – run yolo – paper

ICRA happy hour, losing steam…

Pandemic Diary #11, 02 Jun 2020

hmm

Rosie’s (cat) weight:

Jun 02, 13.0 lbs exactly.
May 10, 13.2 lbs.
May 02, 13.5 lbs .
Apr 27, 13.7 lbs.

Despite dropping from 200 cal to 150 cal almost a week ago, little progress. Previous target calories: 150 (90 from wet food, 50 from dry, 10 from treats).

New target calories: 130 (90 from wet food, 30 from dry, 10 from treats).

 

Harness arrived. Serendipitously, it’s the same color as the paracord she loves playing with. (Okay, to be fair, I used my favorite color for both).

Town Announcer NPC for Online Town (ICRA2020 virtual happy hour)

Made a quick NPC announcer for this virtual Pokemon-style gathering tech (aka Online Town).

Online Town was used for a previous conference, ICLR, where people even “went to the beach” (there are several environments; I chose the conference hall one).

ICLR Town: Pokemon-esque environment to wander around and bump into people, which syncs almost seamlessly with video-chatting capabilities. — maithra_raghu

However, there is no chat history, and no way to set a description or announcements. To work around that, Ondrej Biza suggested I just have a character that plays a repeating audio announcement: the Town Announcer.

After some googling, I quickly whipped together a set of bash scripts to do so. This is on Ubuntu 18.04.

1. Create recording of the announcement

I used `festival`, since I already had that installed for my terminal timer (blinks the terminal red, plays a loud sound, and speaks “n minutes are up”).

sudo apt install festival
text2wave onlinetown.txt | lame - text.mp3

Inside the text file I had

$ vi onlinetown.txt

Welcome! The next scheduled event is: Happy Hour at 5pm Eastern. Again that
is at 5pm Eastern. Move away
to stop hearing this voiceover. To add to this voiceover email me at a b c at m i t dot edu

(Side note: Somehow festival is really bad at pronouncing email, oh well).

2. Create “virtual microphone”

I can then select this as my mic input for the videoconference

As per sebpiq on stackoverflow

pactl load-module module-pipe-source source_name=virtmic file=/tmp/virtmic format=s16le rate=16000 channels=1
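A quick way to confirm the module actually loaded is to look for the source name in PulseAudio's source list. The sketch below assumes the tab-separated layout of `pactl list sources short` (index, name, driver, sample spec, state); `has_source` is a made-up helper name, not something from the stackoverflow answer.

```shell
#!/bin/bash
# has_source NAME: reads `pactl list sources short` output on stdin and
# succeeds if a source with exactly that name is listed.
# Assumes the second tab-separated column is the source name.
has_source() {
    awk -F'\t' -v name="$1" '$2 == name { found = 1 } END { exit !found }'
}

# Usage (requires a running PulseAudio):
#   pactl list sources short | has_source virtmic && echo "virtmic is loaded"
```

Reading the list from stdin keeps the check separate from `pactl` itself, so it can be tried on saved output too.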

3. Play speech recording over virtual mic

As per sebpiq on stackoverflow

ffmpeg -re -i text.mp3 -f s16le -ar 16000 -ac 1 - > /tmp/virtmic

4. Create bash script to continuously do so

$ vi loop_mic.sh

#!/bin/bash
while true
do
    echo "Press [CTRL+C] to stop.."
    ffmpeg -re -i text.mp3 -f s16le -ar 16000 -ac 1 - > /tmp/virtmic
    sleep 0.5
done
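One thing to watch for with the loop above: if the `pactl` module from step 2 isn't loaded, `/tmp/virtmic` doesn't exist as a FIFO, and the `>` redirect silently writes to an ordinary file instead. A possible guard, as a sketch only (the `announce_loop` function and its arguments are made up for illustration; `-nostdin` and `-loglevel error` are extra ffmpeg options I added to keep the loop quiet):

```shell
#!/bin/bash
# announce_loop CLIP [PIPE]: replay CLIP into the virtual mic FIFO until
# interrupted. Refuses to start if PIPE is not a named pipe, since a plain
# `>` redirect would otherwise create an ordinary file there.
announce_loop() {
    local clip="$1" pipe="${2:-/tmp/virtmic}"
    if [ ! -p "$pipe" ]; then
        echo "error: $pipe is not a FIFO; load module-pipe-source first" >&2
        return 1
    fi
    if [ ! -r "$clip" ]; then
        echo "error: cannot read $clip" >&2
        return 1
    fi
    echo "Press [CTRL+C] to stop.."
    while true; do
        # Same conversion as above; stop looping if ffmpeg itself fails
        # instead of spamming errors forever.
        ffmpeg -nostdin -loglevel error -re -i "$clip" \
            -f s16le -ar 16000 -ac 1 - > "$pipe" || return 1
        sleep 0.5
    done
}

# Usage: announce_loop text.mp3
```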

5. Run it

chmod +x loop_mic.sh
./loop_mic.sh

Check that the virtual mic shows up in the Sound settings.

Then open online.town, and when it asks for which microphone input to select, select the Unix FIFO.

To check that it’s all working, I joined from another computer & could hear the announcer. Sometimes it does seem like I can’t hear the announcer when joining, but I can’t reliably reproduce it.

Another sanity check is to go back to Sound settings and check that the “input level” bars are going up and down.

Appendix

I did try speeding up the audio, but it did introduce new audio artifacts (squeaks). If you want to do so, add a setting when generating the mp3. Below it speeds up to 2x.

 text2wave -eval "(Parameter.set 'Duration_Stretch 0.5)" onlinetown.txt | lame - text.mp3

(I also tried speeding it up in the `ffmpeg` command; it sounded horrible.)
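Since `Duration_Stretch` scales how long everything takes, the value is just the reciprocal of the speed-up you want (0.5 for 2x, as above). A small sketch for other speeds; `stretch_for_speed` is a made-up helper name:

```shell
#!/bin/bash
# stretch_for_speed FACTOR: print the Duration_Stretch value for a given
# speed-up factor (stretch = 1 / factor).
stretch_for_speed() {
    awk -v s="$1" 'BEGIN {
        if (s + 0 <= 0) {
            print "speed factor must be a positive number" > "/dev/stderr"
            exit 1
        }
        printf "%g\n", 1 / s
    }'
}

# Usage: generate the announcement at 1.25x speed
#   text2wave -eval "(Parameter.set 'Duration_Stretch $(stretch_for_speed 1.25))" \
#       onlinetown.txt | lame - text.mp3
```

Something gentler than 2x might avoid some of the squeaks, though I haven't compared.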

Better text to speech

This was a pain! Not sure why. Very confusing instructions.

This guide helped in part: https://levelup.gitconnected.com/installing-mozilla-tts-on-a-raspberry-pi-4-e6af16459ab9

This is the final record of what worked.

apt-get install -y espeak libsndfile1 python3-venv
python3 -m venv env
source env/bin/activate
pip3 install -U pip setuptools wheel
pip install https://github.com/reuben/TTS/releases/download/ljspeech-fwd-attn-pwgan/TTS-0.0.1+92aea2a-py3-none-any.whl
git clone git@github.com:mozilla/TTS.git
cd TTS
pip3 install packaging
python -m TTS.server.server
firefox localhost:5002
# FAILS unless inside TTS git folder!

Traceback from failure:

# FAILURE CASE
$ python -m TTS.server.server
> Loading TTS model ...
| > model config: None
| > checkpoint file: None
Traceback (most recent call last):
File "/usr/lib/python3.6/runpy.py", line 193, in _run_module_as_main
"__main__", mod_spec)
File "/usr/lib/python3.6/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/home/chai/projects/TTS/server/server.py", line 62, in <module>
synthesizer = Synthesizer(args)
File "/home/chai/projects/TTS/server/synthesizer.py", line 36, in __init__
self.config.use_cuda)
File "/home/chai/projects/TTS/server/synthesizer.py", line 52, in load_tts
self.tts_config = load_config(tts_config)
File "/home/chai/projects/TTS/utils/io.py", line 16, in load_config
with open(config_path, "r") as f:
TypeError: expected str, bytes or os.PathLike object, not NoneType

Some other commands I tried (didn’t work)

https://github.com/mozilla/TTS/wiki/Released-Models#simple-packaging---self-contained-package-that-runs-an-http-api-for-a-pre-trained-tts-model

sudo apt-get install -y espeak libsndfile1 python3-venv
python3 -m venv env
source env/bin/activate && which python3
pip install -U https://example.com/url/to/python/package.whl
pip3 install -U pip setuptools wheel
$ python -m TTS.server.server --help

It will say synthesizing. After a minute or two it will finish and start playing the audio. Right click and “save audio”.

Output is a wav file.

What a successful TTS.server.server run looks like:

(env) 15:53:25 chai@W530:~/projects/TTS (master %)$ python -m TTS.server.server
/home/chai/projects/env/lib/python3.6/site-packages/librosa/util/decorators.py:9: NumbaDeprecationWarning: An import was requested from a module that has moved location.
Import of 'jit' requested from: 'numba.decorators', please update to use 'numba.core.decorators' or pin to Numba version 0.48.0. This alias will not be present in Numba version 0.50.0.
  from numba.decorators import jit as optional_jit
 > Loading TTS model ...
 | > model config:  /home/chai/projects/env/lib/python3.6/site-packages/TTS/server/model/tts/config.json
 | > checkpoint file:  /home/chai/projects/env/lib/python3.6/site-packages/TTS/server/model/tts/checkpoint.pth.tar
 > Setting up Audio Processor...
 | > sample_rate:22050
 | > num_mels:80
 | > min_level_db:-100
 | > frame_shift_ms:12.5
 | > frame_length_ms:50
 | > ref_level_db:20
 | > num_freq:1025
 | > power:1.5
 | > preemphasis:0.98
 | > griffin_lim_iters:60
 | > signal_norm:True
 | > symmetric_norm:True
 | > mel_fmin:0
 | > mel_fmax:8000.0
 | > max_norm:4.0
 | > clip_norm:True
 | > do_trim_silence:True
 | > trim_db:60
 | > sound_norm:False
 | > n_fft:2048
 | > hop_length:275
 | > win_length:1100
 > Using model: Tacotron2
 > Loading PWGAN model ...
 | > model config:  /home/chai/projects/env/lib/python3.6/site-packages/TTS/server/model/pwgan/config.yml
 | > model file:  /home/chai/projects/env/lib/python3.6/site-packages/TTS/server/model/pwgan/checkpoint.pkl
 > Setting up Audio Processor...
 | > sample_rate:22050
 | > num_mels:80
 | > min_level_db:-100
 | > frame_shift_ms:12.5
 | > frame_length_ms:50
 | > ref_level_db:20
 | > num_freq:1025
 | > power:None
 | > preemphasis:0.98
 | > griffin_lim_iters:None
 | > signal_norm:True
 | > symmetric_norm:True
 | > mel_fmin:0
 | > mel_fmax:8000.0
 | > max_norm:4.0
 | > clip_norm:True
 | > do_trim_silence:True
 | > trim_db:60
 | > sound_norm:False
 | > n_fft:2048
 | > hop_length:275
 | > win_length:1100
 * Serving Flask app "server" (lazy loading)
 * Environment: production
   WARNING: This is a development server. Do not use it in a production deployment.
   Use a production WSGI server instead.
 * Debug mode: off
[INFO]  * Running on http://0.0.0.0:5002/ (Press CTRL+C to quit)
 > Model input: Welcome! The next scheduled event is: Happy Hour at 5pm Eastern. Again that is at 5pm Eastern. Move away to stop hearing this voiceover.
['Welcome!', 'The next scheduled event is: Happy Hour at 5pm Eastern.', 'Again that is at 5pm Eastern.', 'Move away to stop hearing this voiceover.']
[INFO] 127.0.0.1 - - [01/Jun/2020 15:54:03] "GET /api/tts?text=Welcome!%20The%20next%20scheduled%20event%20is%3A%20Happy%20Hour%20at%205pm%20Eastern.%20Again%20that%20is%20at%205pm%20Eastern.%20Move%20away%20to%20stop%20hearing%20this%20voiceover. HTTP/1.1" 200 -

Hurray! The output sounds much nicer.
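The browser step can probably be skipped: the server log above shows the page making a plain GET request to `/api/tts?text=...`, so the same request should work from the command line. A sketch, assuming the server is still running on localhost:5002 and that the endpoint returns the wav bytes directly as the response body (an assumption on my part, not checked against the TTS source); `tts_fetch` is a made-up name:

```shell
#!/bin/bash
# tts_fetch TEXTFILE OUT.wav [BASE_URL]: send the contents of TEXTFILE to the
# local TTS server and save the response.
# Assumption: /api/tts returns the synthesized wav as the response body.
tts_fetch() {
    local textfile="$1" out="$2" base="${3:-http://localhost:5002}"
    # -G turns the url-encoded data into a query string (a GET, as in the log);
    # --fail makes curl exit non-zero on HTTP errors.
    # Newlines in the text file are flattened to spaces first.
    curl --silent --show-error --fail -G \
        --data-urlencode "text=$(tr '\n' ' ' < "$textfile")" \
        --output "$out" "$base/api/tts"
}

# Usage:
#   tts_fetch onlinetown.txt announce.wav
#   ffmpeg -re -i announce.wav -f s16le -ar 16000 -ac 1 - > /tmp/virtmic
```

ffmpeg reads the wav directly, so there is no need to convert to mp3 before feeding the virtual mic.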

 

Note: to remove the virtual microphone when you're done, undo with

pactl unload-module module-pipe-source
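Unloading by name removes every loaded `module-pipe-source`, not just this one. `pactl load-module` prints the index of the module it loaded, so one option is to save that index and unload only that instance. A sketch (the function names and the index file path are made up):

```shell
#!/bin/bash
# Where to remember the module index between load and unload (arbitrary path).
VIRTMIC_IDX_FILE="${VIRTMIC_IDX_FILE:-/tmp/virtmic.module}"

load_virtmic() {
    # pactl load-module prints the index of the newly loaded module;
    # keep it so the matching unload can target exactly this instance.
    pactl load-module module-pipe-source source_name=virtmic \
        file=/tmp/virtmic format=s16le rate=16000 channels=1 \
        > "$VIRTMIC_IDX_FILE"
}

unload_virtmic() {
    # An empty or missing file means nothing was (successfully) loaded.
    [ -s "$VIRTMIC_IDX_FILE" ] || { echo "no saved module index" >&2; return 1; }
    pactl unload-module "$(cat "$VIRTMIC_IDX_FILE")" && rm -f "$VIRTMIC_IDX_FILE"
}

# Usage:
#   load_virtmic      # instead of the pactl load-module line in step 2
#   unload_virtmic    # when the happy hour is over
```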