Dissociated Press
Posted: November 26, 2007 Filed under: plsql Leave a comment »There is this nice Emacs game where you take some random text and use random walk to generate a somewhat similar but completely random text out of it. It is called “Dissociated Press“.
Out of sheer randomness, I’ve implemented this in Oracle.
Notes:
- I’m using varchar to hold the text, which limit us to short and therefore uninteresting texts. CLOB would be better.
- If there is any serious amount of text involved, you’ll want to create a non-unique index on the first two columns of the markov table after you populate the table, and gather statistics on that.
- Feel free to criticize my code. That is why I post it. But note that I’ve broken some lines so it will look nice in the blog.
- Also feel free to rewrite the entire thing with a single model statement.
- The algorithm should be pretty straightforward, but you can read about a similar implementation here.
CREATE TABLE markov
(c1 VARCHAR(30),
c2 VARCHAR(30),
c3 VARCHAR(30));
DECLARE
cnt BINARY_INTEGER;
i BINARY_INTEGER;
my_str VARCHAR(4000) := 'A bug that has been documented.
To call something a feature sometimes means the author of the
program did not consider the particular case, and that the program
responded in a way that was unexpected but not strictly incorrect.
A standard joke is that a bug can be turned into a feature simply
by documenting it (then theoretically no one can complain about
it because its in the manual), or even by simply declaring it to
be good. Thats not a bug, thats a feature is a common
catchphrase.';
my_comma_str VARCHAR(4000);
my_table dbms_utility.uncl_array;
num_itr INTEGER := 50;
tmp_c1 VARCHAR(30);
tmp_c2 VARCHAR(30);
tmp_c3 VARCHAR(30);
BEGIN
select REPLACE('"' ||
REPLACE(my_str,' ','","') || '"',',"",',',')
INTO my_comma_str FROM dual;
dbms_utility.comma_to_table(my_comma_str, cnt, my_table);
FOR i in 3..cnt LOOP
INSERT INTO markov (c1,c2,c3) VALUES
(replace(my_table(i-2),'"',''),
replace(my_table(i-1),'"',''),replace(my_table(i),'"',''));
END LOOP;
COMMIT;
SELECT c1,c2,c3 INTO tmp_c1,tmp_c2,tmp_c3 FROM markov
WHERE ROWNUM=1;
DBMS_OUTPUT.put_line(tmp_c1 || ' ' || tmp_c2 || ' ');
FOR i IN 1..num_itr loop
SELECT c1,c2,c3 INTO tmp_c1,tmp_c2,tmp_c3 from
(SELECT * FROM markov WHERE c1=tmp_c2 AND
c2=tmp_c3 ORDER BY dbms_random.VALUE) t
WHERE ROWNUM=1;
DBMS_OUTPUT.put_line(tmp_c2 || ' ');
END LOOP;
END;
**
This post has been partially inspired by this one.
